PATENT
Customer No. 32425 



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

In re Application of: 

Peter DROGE, Nicole CHRIST and Group Art Unit: 1636

Elke LORBACH 

Examiner: Q. Nguyen 

Serial No.: 10/082,772 

Atty. Dkt. No.: DEBE:008US/SLH 

Filed: February 22, 2002 

For: SEQUENCE-SPECIFIC DNA RECOMBI- Confirmation No.: 4391 
NATION IN EUKARYOTIC CELLS 



APPEAL BRIEF 



Commissioner for Patents 
P.O. Box 1450 

Alexandria, VA 22313-1450
Dear Sir: 

This Appeal Brief is filed in response to the Office Action mailed on February 21, 2008. 
Appellants' brief is due May 21, 2008. Notice of Appeal is filed concurrently herewith. Also
included herewith is the fee for the notice of appeal and brief. No other fees are believed due in
connection with this filing; however, should appellants' payment be missing or deficient, or
should any fees be due, appellants authorize the Commissioner to debit Fulbright & Jaworski
L.L.P. Deposit Account No. 50-1212/DEBE:008US. 



I. Real Party In Interest 

The real party in interest is the assignee, Peter Dröge.
II. Related Appeals and Interferences

There are no related appeals or interferences. 

III. Status of the Claims 

Claims 1-28 were filed with the original application, but these claims were canceled in a 
preliminary amendment in favor of new claims 29-60. Claims 29-51 and 58 were elected in a 
response to restriction requirement, and claims 31, 40-42, 52-57, 59 and 60 were canceled during
prosecution. Thus, claims 29, 30, 32-39, 43-51 and 58 are pending in the application, stand 
rejected and are appealed. A copy of the appealed claims is attached as Appendix A. 

IV. Status of the Amendments 

No unentered amendments have been offered. 

V. Summary of the Claimed Subject Matter 

Independent claim 29, drawn to a method of sequence specific recombination of DNA in 
a eukaryotic cell using attB, attP, attR and attL sequences, along with wild-type or int-h or int- 
h/218 integrases, is supported in the application as filed at page 8, line 26, to page 9, line 6, and 
page 14, lines 9-14. 

VI. Grounds of Rejection to be Reviewed on Appeal

1. Are claims 29, 30, 32, 33, 36, 38, 44-48 and 58 obvious under 35 U.S.C. §103
over the combined disclosures of Crouzet et al. (Exhibit 1) and Christ & Dröge
(Exhibit 2).

2. Are claims 29 and 43 obvious under 35 U.S.C. §103 over Crouzet et al., Christ &
Dröge and Capecchi et al. (Exhibit 3).

3. Are claims 29, 34-37 and 39 obvious under 35 U.S.C. §103 over Crouzet et al, 
Christ & Droge and Hartley et al (Exhibit 4). 
VII. Argument

A. Standard of Review 

Findings of fact and conclusions of law by the U.S. Patent and Trademark Office must be 
made in accordance with the Administrative Procedure Act, 5 U.S.C. §706(A), (E), 1994. 
Dickinson v. Zurko, 527 U.S. 150, 158 (1999). Moreover, the Federal Circuit has held that 
findings of fact by the Board of Patent Appeals and Interferences must be supported by 
"substantial evidence" within the record. In re Gartside, 203 F.3d 1305, 1315 (Fed. Cir. 2000). 
In In re Gartside, the Federal Circuit stated that "the 'substantial evidence' standard asks 
whether a reasonable fact finder could have arrived at the agency's decision." Id. at 1312. 
Accordingly, it necessarily follows that an examiner's position on appeal must be supported by 
"substantial evidence" within the record in order to be upheld by the Board of Patent Appeals 
and Interferences. 

B. Rejections Under 35 U.S.C. §103

i. Crouzet et al. and Christ & Droge

Claims 29, 30, 32, 33, 36, 38, 44-48 and 58 remain rejected over the combined 
disclosures of Crouzet et al. and Christ & Droge. The examiner states that the skilled artisan 
would have modified the method taught by Crouzet et al. by utilizing the mutant lambda 
integrases Int-h and Int-h/218 described in Christ & Dröge for their method of generating
chimeric DNA. Appellants maintain that the rejection is improper, as discussed below. 

As has been pointed out in each of appellants' previous responses, Crouzet et al. worked 
with wild-type integrases in eukaryotic cells, while Christ & Droge worked in prokaryotic 
systems with mutant integrases. It is appellants' position that there was no a priori expectation 
of success, even if motivation for combining these two very distinct systems were presumed. 
For most of the prosecution, the examiner simply states that "an ordinary skilled artisan would
have a reasonable expectation of success in light of the cited references." As a matter of law, 
such a statement is legally insufficient to establish obviousness, particularly in light of the 
countervailing declaratory evidence: 

The legal concept of prima facie obviousness is a procedural tool of examination 
which applies broadly to all arts. It allocates who has the burden of going forward with 
production of evidence in each step of the examination process. See In re Rinehart, 531 
F.2d 1048, 189 USPQ 143 (CCPA 1976); In re Linter, 458 F.2d 1013, 173 USPQ 560 
(CCPA 1972); In re Saunders, 444 F.2d 599, 170 USPQ 213 (CCPA 1971); In re Tiffin, 
443 F.2d 394, 170 USPQ 88 (CCPA 1971), amended, 448 F.2d 791, 171 USPQ 294 
(CCPA 1971); In re Warner, 379 F.2d 1011, 154 USPQ 173 (CCPA 1967), cert. denied,
389 U.S. 1057 (1968). The examiner bears the initial burden of factually supporting 
any prima facie conclusion of obviousness. If the examiner does not produce a prima 
facie case, the applicant is under no obligation to submit evidence of nonobviousness. If, 
however, the examiner does produce a prima facie case, the burden of coming forward 
with evidence or arguments shifts to the applicant who may submit additional evidence of 
nonobviousness, such as comparative test data showing that the claimed invention 
possesses improved properties not expected by the prior art. The initial evaluation of 
prima facie obviousness thus relieves both the examiner and applicant from evaluating 
evidence beyond the prior art and the evidence in the specification as filed until the art 
has been shown to suggest the claimed invention. 

To reach a proper determination under 35 U.S.C. 103, the examiner must step 
backward in time and into the shoes worn by the hypothetical "person of ordinary skill in 
the art" when the invention was unknown and just before it was made. In view of all 
factual information, the examiner must then make a determination whether the claimed 
invention "as a whole" would have been obvious at that time to that person. Knowledge 
of applicant's disclosure must be put aside in reaching this determination, yet kept in 
mind in order to determine the "differences," conduct the search and evaluate the "subject 
matter as a whole" of the invention. The tendency to resort to "hindsight" based upon 
applicant's disclosure is often difficult to avoid due to the very nature of the examination 
process. However, impermissible hindsight must be avoided and the legal conclusion 
must be reached on the basis of the facts gleaned from the prior art. . . .

To establish a prima facie case of obviousness, three basic criteria must be met. 
First, there must be some suggestion or motivation, either in the references themselves or 
in the knowledge generally available to one of ordinary skill in the art, to modify the 
reference or to combine reference teachings. Second, there must be a reasonable 
expectation of success. Finally, the prior art reference (or references when combined) 
must teach or suggest all the claim limitations. The teaching or suggestion to make the 
claimed combination and the reasonable expectation of success must both be found in 
the prior art, and not based on applicant's disclosure. In re Vaeck, 947 F.2d 488, 20 
USPQ2d 1438 (Fed. Cir. 1991). See MPEP § 2143 - § 2143.03 for decisions pertinent to 
each of these criteria .... 
When an applicant submits evidence, whether in the specification as originally
filed or in reply to a rejection, the examiner must reconsider the patentability of the 
claimed invention. The decision on patentability must be made based upon consideration 
of all the evidence, including the evidence submitted by the examiner and the evidence 
submitted by the applicant. A decision to make or maintain a rejection in the face of all 
the evidence must show that it was based on the totality of the evidence. Facts established 
by rebuttal evidence must be evaluated along with the facts on which the conclusion of 
obviousness was reached, not against the conclusion itself. In re Eli Lilly & Co., 902 F.2d 
943, 14 USPQ2d 1741 (Fed. Cir. 1990). 

MPEP §2142 (emphasis added). Here, the examiner simply cites to the references generally and
to the "skill in the art" - those are not "evidence," and are not sufficient to establish a prima facie
case.

In the most recent office action, the examiner attempts to argue that the record does, in
fact, evidence that the operability of modified int's in eukaryotic cells could have been
predicted. This argument rests solely upon the alleged statement that "the conditions required by
wild-type lambda integrase to mediate recombination actions in prokaryotic cells, under
physiologic conditions and in vitro conditions are apparently more stringent than those
required by Int-h." Office Action of February 21, 2008, paragraph bridging pages 8-9
(emphasis added). The evidence to support this statement comes, supposedly, from Hartley et al.,
Christ & Droge, and Lange-Gustafson et al. (Exhibit 5). As will be explained below, nothing
could be further from the truth.

First, Hartley et al. makes no mention of modified integrases. Thus, it is an impossibility
for Hartley to provide any comparative statement about the stringency of a modified int's
activity relative to that of wild-type lambda integrases.

Second, the relevance of Lange-Gustafson et al. has been effectively undercut by
appellants' response on February 14, 2007, where it was stated that "considering [that] the data 
of Lange-Gustafson that allegedly shows that Int-h works better with supercoiled DNA, 
applicants submit that the studies described in this paper were performed in vitro and reflect
conditions (25 °C and KCl at 25 mM) that have nothing to do with the environment inside a
living eukaryotic cell." The subsequent Office Action (Advisory Action of February 27, 2007)
completely misses the point in arguing that "[t]here is no factual evidence that Int-h would not be
able to use supercoiled DNA at all ... physiological conditions or inside a eukaryotic cell . . . ." It
is not appellants' duty to disprove obviousness when the rationale underlying the rejection
lacks scientific merit. Rather, it is incumbent upon the examiner to explain why the defects in
the reference he cited are of no moment, not the other way around.

Third, with respect to Christ & Droge, appellants have repeatedly emphasized that the 
work reported in this reference on modified integrases was all performed in prokaryotic cells. 
Whatever the requirements observed in such cells, they cannot be extrapolated to the activity in 
eukaryotic cells. Much as with Hartley, it is impossible for a reference to compare the activity 
of wild-type with that of mutant integrases when only one of the two enzymes is discussed. 

Moreover, the examiner has dismissed evidence provided by appellants, and instead 
substituted his own unsupported conclusions on expectation of success. In the Rule 132 
declaration submitted by the inventor (Exhibit 6), Dr. Dröge explained that the skilled artisan
could not predict the success of using modified integrases in eukaryotic cells for the simple
reason that it is well known that the organization of the prokaryotic genome is distinct from
that of eukaryotes. Whereas the prokaryotic genome is circular and condensed due to negative
supercoiling and architectural proteins like IHF, the eukaryotic genome is comprised of linear 
DNA molecules which are highly condensed in nucleosomes by histone proteins. The skilled 
artisan knows that lambda integrase-mediated recombination is highly dependent on the 
topological status of the DNA to be recombined and distinct accessory factors. In particular, 
integrase-mediated recombination is dependent on distinct bending specificities of the DNA to
allow the formation of DNA/protein complexes which finally give rise to the recombination
event (see Christ & Dröge, p. 826, left col., 2nd para. to right col., 2nd para.).

Without the aid of topologically underwound DNA, which exists only in prokaryotic
cells, it was reasonable to assume that mutant Int proteins cannot function. Thus, although the
modified integrases of Christ & Droge are adapted to work without the DNA-stabilizing
factor IHF and the enzyme Xis in prokaryotic cells (i.e., having a prokaryotic DNA substrate),
there was no reasonable basis for the skilled artisan to expect that they would also work in
eukaryotic cells, i.e., having a eukaryotic DNA substrate. The examiner was requested to provide any
countervailing evidence within his personal knowledge under 37 CFR §1.104(d)(2); MPEP
§2144.03. In response, the examiner argued only that, other than the difference in
DNA topology, no other facts were provided. He then merely reverted to the same argument that
is said to support his conclusions - that wild-type integrases have more stringent requirements
than those of modified int's. However, as explained in detail above, that statement only applies
in the context of prokaryotic cells, which is not claimed here. There could not be a clearer
example of the examiner simply substituting his own subjective beliefs for those of the appellants.
Moreover, the only fact of record supports appellants' position regarding unpredictability - and 
that alone is sufficient to undercut the examiner's position. 

Thus, it is submitted, again, that the basic premise for the rejection - that one can assume 
that modified integrase will operate in eukaryotic cells based on their performance in prokaryotic 
cells - is flawed as a matter of scientific principle. It could very well have been the case that 
what made them allegedly work "better" in prokaryotic cells, as compared to wild-type integrase, 
would have worked against them in the context of eukaryotic cells. This is simply the nature of 
biological systems, where unpredictability is readily acknowledged by the field. For these
reasons, and as set forth above, reversal of this rejection is requested. 

ii. Crouzet et al., Christ & Droge, and Capecchi et al.
Claims 29 and 43 remain rejected over Crouzet et al., Christ & Dröge and Capecchi et al.
Just as with the previous rejections, applicants submit that the rejection here fails for lack of
motivation and lack of an expectation of success. The defects of Crouzet et al. and Christ &
Droge have been discussed above and will not be repeated here. Capecchi et al., which simply is
cited for a "positive-negative selector vector," fails to address the issue of whether modified 
integrases would work in eukaryotic cells, as set out in detail above. Thus, again, there was no 
motivation for combining the primary and secondary references, and even if there were, there 
was no likelihood of success that they would work together. 

Thus, for the reasons set forth above, reversal of this rejection also is respectfully 
requested. 

iii. Crouzet et al., Christ & Droge, and Hartley et al.
Claims 29, 34-37 and 39 stand rejected over Crouzet et al., Christ & Droge and Hartley et
al. Just as with the previous rejections, applicants submit that the rejection here fails for lack of
motivation and lack of an expectation of success. The defects of Crouzet et al. and Christ &
Droge have been discussed above and will not be repeated here. Hartley et al., which teaches
recombinational methods in prokaryotic and eukaryotic host cells using, inter alia, the lambda 
integrase recombination system with exclusively the wild-type lambda integrase, fails to address 
the issue of whether modified integrases would work in eukaryotic cells, as set out in detail 
above. Thus, again, there was no motivation for combining the primary and secondary
references, and even if there were, there was no likelihood of success that they would work
together.

Thus, for the reasons set forth above, reversal of this rejection also is respectfully 
requested. 

C. Conclusion 

In light of the foregoing, appellant respectfully submits that all pending claims are non- 
obvious under 35 U.S.C. §103. Therefore, it is respectfully requested that the Board reverse each 
of the pending rejections. 



Date: May 21, 2008




Respectfully submitted,



Highlander



Fulbright & Jaworski L.L.P. 
600 Congress Ave., Suite 2400 
Austin TX 78701 
512-474-5201 
VIII. APPENDIX A - APPEALED CLAIMS



29. A method of sequence specific recombination of DNA in a eukaryotic cell, comprising: 

(a) providing said eukaryotic cell, said cell comprising a first DNA segment
integrated into the genome of said cell, said first DNA segment comprising an
attB sequence according to SEQ ID NO:1 or a derivative thereof, an attP
sequence according to SEQ ID NO:2 or a derivative thereof, an attL sequence
according to SEQ ID NO:3 or a derivative thereof, or an attR sequence according
to SEQ ID NO:4 or a derivative thereof;

(b) introducing a second DNA segment into said cell, wherein if said first DNA
segment comprises an attB sequence according to SEQ ID NO:1 or a derivative
thereof, said second DNA segment comprises an attP sequence according to SEQ
ID NO:2 or a derivative thereof, wherein if said first DNA segment comprises an
attP sequence according to SEQ ID NO:2 or a derivative thereof, said second
DNA segment comprises an attB sequence according to SEQ ID NO:1 or a
derivative thereof, wherein if said first DNA segment comprises an attL sequence
according to SEQ ID NO:3 or a derivative thereof, said second DNA segment
comprises an attR sequence according to SEQ ID NO:4 or a derivative thereof, or
wherein if said first DNA segment comprises an attR sequence according to SEQ
ID NO:4 or a derivative thereof, said second DNA segment comprises an attL
sequence according to SEQ ID NO:3 or a derivative thereof; and

(c) further comprising providing to said cell a modified bacteriophage lambda
integrase Int, wherein said modified Int is Int-h or Int-h/218, which induces
sequence specific recombination through said attB and attP or attR and attL
sequences.

30. The method of claim 29, wherein said first DNA segment was introduced into the 
genome of said cell by recombinant methods. 

32. The method of claim 29, wherein said first DNA segment comprises an attB sequence 
according to SEQ ID NO:1 or a derivative thereof, and said second DNA comprises an
attP sequence according to SEQ ID NO:2 or a derivative thereof.

33. The method of claim 29, wherein said first DNA segment comprises an attP sequence
according to SEQ ID NO:2 or a derivative thereof, and said second DNA comprises an
attB sequence according to SEQ ID NO:1 or a derivative thereof.

34. The method of claim 29, wherein said first DNA segment comprises an attL sequence
according to SEQ ID NO:3 or a derivative thereof, and said second DNA sequence
comprises an attR sequence according to SEQ ID NO:4 or a derivative thereof, further
comprising, in step (d), providing to said cell a Xis factor.

35. The method of claim 29, wherein said first DNA segment comprises an attR sequence
according to SEQ ID NO:4 or a derivative thereof, and said second DNA sequence
comprises an attL sequence according to SEQ ID NO:3 or a derivative thereof, further
comprising, in step (d), providing to said cell a Xis factor.

36. The method of claim 29, further comprising providing to said cell a third DNA segment 
comprising an Int gene. 

37. The method of claim 36, further comprising providing to said cell a fourth DNA segment 
comprising a Xis factor gene, respectively. 

38. The method of claim 36, wherein said third DNA segment further comprises a regulatory 
sequence effecting a spatial and/or temporal expression of the Int gene. 

39. The method of claim 37, wherein said fourth DNA segment further comprises a 
regulatory sequence effecting a spatial and/or temporal expression of the Xis factor gene. 

43. The method according to claim 29, wherein said first and/or second DNA segment further 
comprise a sequence effecting integration of said first and/or second DNA segment into 
the genome of said cell by homologous recombination. 

44. The method of claim 29, wherein said first and/or second DNA segment further 
comprises a sequence coding for a polypeptide of interest.

45. The method of claim 44, wherein said polypeptide of interest is a structural protein, an 
endogenous or exogenous enzyme, a regulatory protein or a marker protein. 

46. The method of claim 29, wherein said first and second DNA segment are introduced into 
the eukaryotic cell on the same DNA molecule. 

47. The method of claim 29, wherein said eukaryotic cell is a mammalian cell. 

48. The method of claim 47, wherein said mammalian cell is a human, simian, mouse, rat, 
rabbit, hamster, goat, bovine, sheep or pig cell. 

49. The method of claim 29, further comprising: 

(d) performing a second sequence specific recombination of DNA by Int-h or Int-
h/218 and a Xis factor after the steps (a)-(c), wherein said first DNA sequence
comprises said attB sequence according to SEQ ID NO:1 or a derivative thereof
and said second DNA sequence comprises the attP sequence according to SEQ ID
NO:2 or a derivative thereof, or wherein said first DNA sequence comprises said
attP sequence according to SEQ ID NO:2 or a derivative thereof and said second
DNA sequence comprises the attB sequence according to SEQ ID NO:1 or a
derivative thereof.

50. The method of claim 49, further introducing a further DNA sequence into said cells, the 
further DNA sequence comprising a Xis factor gene. 

51. The method of claim 50, wherein said further DNA sequence comprises further a 
regulatory DNA sequence effecting a spatial and/or temporal expression of said Xis 
factor gene. 

58. An isolated eukaryotic cell obtainable according to the method of claim 29. 
IX. APPENDIX B - EVIDENCE CITED



Exhibit 1 - Crouzet et al. 
Exhibit 2 - Christ & Dröge et al.
Exhibit 3 - Capecchi et al. 
Exhibit 4 - Hartley et al. 
Exhibit 5 - Lange-Gustafson et al. 
Exhibit 6 - Declaration of Peter Droge 
X. APPENDIX C - RELATED PROCEEDINGS

None 
Reference: DEBE:008US
EXHIBITS



Vol. 259, No. 20, Issue of October 25, pp. 12724-12732, 1984



Purification and Properties of Int-h, a Variant Protein Involved in 
Site-specific Recombination of Bacteriophage λ*

(Received for publication, March 13, 1984)

Brenda J. Lange-Gustafson‡ and Howard A. Nash§

From the Laboratory of Neurochemistry, National Institute of Mental Health, Bethesda, Maryland 20205



Under physiological conditions, integration of λ DNA
into the Escherichia coli chromosome requires the di-
rect participation of only two proteins, the viral int
gene product and E. coli integration host factor (IHF).
A variant of the int gene has been isolated that permits
integrative recombination in cells mutant for one of
the two subunits of IHF (Miller, H. I., Mozola, M. A.,
and Friedman, D. I. (1980) Cell 20, 721-729). In the
present work, we have purified Int-h, the product of
this variant gene. In contrast to the wild-type int gene
product (Int+), which produces almost no recombinants
in the absence of IHF, purified Int-h protein sponsors
reduced but significant levels of integrative recombi-
nation in the absence of any E. coli supplement. This
shows that the int gene encodes all the information
necessary for the elementary steps in recombination
and implies that IHF functions as an accessory protein.

When supplemented by IHF, recombination pro-
moted by Int-h resembles that promoted by Int+ in
kinetics, stoichiometry of Int and IHF, and nature of
the recombinant product. Under these conditions, Int-
h uses supercoiled DNA more effectively than nonsu-
percoiled DNA as a substrate for recombination, as
does Int+. However, in the absence of IHF, Int-h recom-
bines supercoiled and nonsupercoiled substrates iden-
tically, indicating that IHF is an important part of the
mechanism that senses the supercoiled state of the sub-
strate DNA during recombination. A surprising differ-
ence in recombination carried out by Int-h in the pres-
ence or absence of IHF concerns the degree to which
sites on the same circle recombine with one another as
opposed to sites on sister molecules. In the presence of
IHF, Int-h favors intramolecular recombination, as
does Int+. However, in the absence of IHF, Int-h almost
exclusively promotes intermolecular recombination.



Bacteriophage λ has a specialized system for the integration
of viral DNA into the bacterial chromosome. This system
carries out a reciprocal recombination between a specific viral
site, attP, and a specific host site, attB. The sequences and
functional extents of these sites are known (for a recent
review, see Ref. 1). A combination of genetic and biochemical
experiments has shown that integrative recombination is car-
ried out by two proteins: Int, the product of the viral int gene


* The costs of publication of this article were defrayed in part by
the payment of page charges. This article must therefore be hereby
marked "advertisement" in accordance with 18 U.S.C. Section 1734
solely to indicate this fact.

‡ Present address, Office of Extramural Project Review, National
Institute of Mental Health, Rockville, MD 20857.

§ Present address, Laboratory of Molecular Biology, National In-
stitute of Mental Health, Building 36/Room 1B08, Bethesda, MD
20205.



and IHF,¹ a protein that is composed of two polypeptides, the
products of the E. coli himA and hip genes (1). Several studies
have led to the conclusion that Int is the protein that carries
out the breakage and rejoining steps in recombination. First,
Int binds to the core region of attP and attB, the 15-base pair
region of homology wherein the recombination crossover oc-
curs (2). Moreover, Int has a topoisomerase activity that can
relax supercoiled DNA (3) and can break attachment site
DNA, albeit at low frequency, precisely at the nucleotides
within the core that are involved in the crossover (4). Finally,
Int can promote the exchange of a pair of strands in DNA
assemblies that have been constructed to resemble recombi-
nation intermediates (5). Although these findings suggest that
the role of Int is simply to promote strand exchange, other
data suggest that it has additional roles. Chemical modifica-
tion of Int can destroy recombination activity while leaving
binding to the core and relaxing activity unchanged (4, 6). In
addition, Int binds to portions of attP that are exterior to the
core; this binding to the so-called arms of attP appears to be
essential for recombination activity (2, 7). Analysis of the
sequences protected by Int in the core and arm regions of attP
indicates that Int is a bifunctional protein that recognizes two
distinct binding sequences (6, 8).

The study of mutant proteins may be useful in dissecting
the various ways in which Int protein promotes integrative
recombination. In this paper, we begin the analysis of one
variant, Int-h. This variant was isolated after selection for λ
bacteriophage that could undergo site-specific recombination
in an E. coli host that was mutant for IHF (9). The mutation
proved to map in the int gene and in vivo studies indicated
that the int-h allele produced a protein with an enhanced
recombination potential. For example, in a strain deleted for
attB, int-h was superior to int+ in promoting the integration
of λ into secondary bacterial sites. In addition, int-h showed
altered recombination potential for excision, the removal of
integrated viral DNA (9). In vivo studies have not revealed
the basis of the enhanced recombination efficiency of the int-
h allele. It might be that the Int-h protein is altered in its
capacity to interact with IHF, its affinity for core or arm
binding sequences, its tendency to form nucleosome-like
structures at attachment sites (10, 11), its intrinsic topoisom-
erase activity, etc. Since variation in any of these activities
could provide a valuable probe for the analysis of the detailed
mechanism of recombination, we have undertaken the study
of the Int-h protein. This report presents our data on the
cloning and purification of Int-h and our initial results on the
characterization of the recombination capacity of this protein.

¹ The abbreviations used are: IHF, integration host factor; SDS,
sodium dodecyl sulfate; TEMED, N,N,N',N'-tetramethylethylene-
diamine; kb, kilobase; bp, base pair.
EXPERIMENTAL PROCEDURES



Bacteria and Bacteriophage— The bacteria used in this work were
derivatives of the E. coli K12 strain N99. Strain HN356 (constructed
by R. A. Weisberg, National Institutes of Health, Bethesda, MD)
contains the recB21 mutation. Strains K5185 and K572 (constructed
by H. Miller, Genentech, Inc., San Francisco, CA) respectively con-
tain a partial deletion, himA82, and a point mutation, himA42, in the
gene for the α subunit of IHF. Strain K5248 (constructed by H.
Miller) contains a point mutation, hip157, in the gene for the β
subunit of IHF. Strain JD12 (constructed by K. Abremski, E. I.
duPont deNemours & Co., Wilmington, DE) contains both the
himA42 and hip157 mutations.

Bacteriophage strain Y619 (constructed by H. Miller) is λ h int-h
intC226 cI857; it was grown in strain K5185 and individual isolates
were tested for the absence of int-promoted deletions by scoring
sensitivity to EDTA (12). Bacteriophage strain G903 (constructed by
S. Adhya, National Institutes of Health, Bethesda, MD) is λ attB-
attP intZ xisl redlU imm434 cll cII28 clUeil.

Plasmids— Plasmid pRSF2124 (13) is a derivative of colE1 that
contains the TnA transposon; it was obtained from L. Enquist, E. I.
duPont deNemours & Co., Wilmington, DE. Plasmid pC22642 (14)
is pRSF2124 containing an EcoRI insert from λ intC226; pHN16 (this
work) is the identical construct except that the EcoRI insert is from
Y619. Recombination substrates were grown, labeled with [³H]thy-
midine and purified as described (15). Plasmid pPA1, pBB105, and
pBP1 are described in Ref. 15; plasmid pBP86 and pBP90 are de-
scribed in Ref. 16. A detailed restriction map of the attP insert that
is common to pPA1, pBP86, and pBP90 can be found in Ref. 11.

Proteins— Wild-type Int protein (Int+) was purified as described
(17) from strain HN695, a derivative of strain K5185 containing the
plasmid pC22642. Int-h protein was purified from strain HN700, a
derivative of strain K5185 containing the plasmid pHN16. Wild-type
IHF was purified through Fraction V as described (18) from strain
HN356. Crude extracts of wild-type and mutant IHF were made by
[...] N99, K5185, K572, K5248, and JD12 as described (17).
Typical recombination mixtures (27 µl) contained 37 mM Tris-HCl
(pH 7.4), 5.6 mM spermidine, 1.1 mM EDTA, 1.1 mg/ml bovine serum
albumin, 25 to 75 mM KCl, purified IHF, and purified Int as indicated.
In some reactions, purified IHF was replaced by 0.2 µl of sonic extract.
The reaction mixtures also contained either 0.30 µg of a plasmid
containing both attP and attB or 0.6 µg of an equimolar mixture of
two plasmids, one containing attP and the other containing attB.
Unless otherwise noted, the reactions were incubated for 1 h at 25 °C
and then stopped as described for the intramolecular recombination
assay in Ref. 15. Restriction of the recombined DNA was carried out
with 10-50 units of endonuclease for 1 h at 37 °C. The samples were
prepared for gel electrophoresis by addition of 5 µl of a solution
containing 25% (w/v) Ficoll, 2% (w/v) SDS, and 0.1% (w/v) bromphenol
blue and extracted once with about 100 µl of a 24:1 (v/v)
mixture of chloroform and isoamyl alcohol. Agarose gel electrophoresis
was carried out as described (15). To quantitate recombination,
bands were visualized by ethidium bromide fluorescence, cut from the
gel, solubilized at 90 °C with 1 ml of 5 M sodium perchlorate, and
counted with 15 ml of Aquasol (New England Nuclear). In some
experiments, recombination was assessed by transfer of fragments to
nitrocellulose paper by the method of Southern and hybridized with
³²P-labeled DNA as described (11). Acrylamide gel electrophoresis
and quantitation of the resulting DNA bands were carried out as



Other Methods 

Topoisomerase Activity — Relaxation assays (21 µl) contained 62
mM Tris-HCl (pH 7.5), 67 mM KCl, 5.25 mM EDTA, 3.0 mg/ml
bovine serum albumin, 1 µg of pPA1 plasmid DNA, and 1 unit of
Fraction III Int as indicated. The reaction mixtures were incubated
at 25 °C, stopped by addition of 2 µl of 10% (w/v) SDS, and diluted
to 6.16 ml with a solution of 50 mM Tris-HCl (pH 8.0) containing 25
mM EDTA. To this was added 6.05 g of cesium chloride and ml
of ethidium bromide (10 mg/ml) and 2.5 µl of ¹⁴C-labeled supercoiled
pPA1 plasmid DNA. The mixture was centrifuged at 15 °C in a
Beckman type 65 rotor for 60 h at 35,000 rpm. The gradient was
fractionated from the bottom and counted. To assess the extent of
relaxation, the separation between the peak of the ¹⁴C-labeled marker
DNA and the peak of the treated DNA was divided by the separation
between the marker DNA and a sample of pPA1 plasmid DNA that
had been completely relaxed by treatment with HeLa cell topoisomerase
I (a gift of Dr. L. Liu, Johns Hopkins University, Baltimore,
MD).
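
The extent-of-relaxation measure described above is a ratio of two peak separations in the gradient. A minimal sketch in Python, assuming peak positions are recorded as gradient fraction numbers; the function name and the example positions are hypothetical and are not taken from the paper:

```python
def relaxation_extent(marker_peak, treated_peak, relaxed_peak):
    """Extent of relaxation as a fraction of complete relaxation.

    marker_peak  -- peak position of the labeled supercoiled marker DNA
    treated_peak -- peak position of the Int-treated DNA
    relaxed_peak -- peak position of fully relaxed control DNA
    All positions are in the same units (for example, fraction number).
    """
    full_separation = relaxed_peak - marker_peak
    if full_separation == 0:
        raise ValueError("marker and fully relaxed peaks coincide")
    return (treated_peak - marker_peak) / full_separation


# Hypothetical peak positions, for illustration only.
print(relaxation_extent(marker_peak=12.0, treated_peak=15.0, relaxed_peak=18.0))
```

A value of 0 corresponds to DNA that still bands with the supercoiled marker, and a value of 1 to DNA that bands with the fully relaxed control.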

SDS Gel Electrophoresis— The gels were slabs 16 cm × 17 cm ×
1.5 mm. The separating gel contained 18% (w/v) acrylamide, 0.6%
bisacrylamide, 1 M urea, 375 mM Tris-HCl (pH 8.8), 2 mM EDTA,
0.1% SDS, 0.1% ammonium persulfate, and 0.66% (v/v) TEMED.
The stacking gel contained 5% acrylamide, 0.13% bisacrylamide, 1 M
urea, 125 mM Tris-HCl (pH 6.8), 2 mM EDTA, 0.1% SDS, 0.1%
ammonium persulfate, and 0.08% TEMED. The running buffer was
M glycine, 1 M urea, 0.025 M Tris base, and 0.1% SDS. The
samples were precipitated with trichloroacetic acid and washed
as described (15). They were resuspended in 28 µl of 120 mM
Tris-HCl (pH 6.8), 2.4% SDS, 4.8 mM EDTA, 24% (v/v) glycerol,
0.007% bromphenol blue, and 2.25% β-mercaptoethanol. The samples
were then heated to 90 °C for 2 min and electrophoresed at 30 mA
for 4 h. The gel was stained with 0.05% Coomassie Brilliant Blue in
a solution of 50% methanol and 7.5% glacial acetic acid for 2 h and
destained in 5% methanol containing 7.5% glacial acetic acid.



Enzyme Purification— In order to provide a rich source of
Int-h protein, we cloned the int-h gene in a multicopy plasmid.
As before (18), we employed a λ variant, intC226, in which
the int gene is expressed constitutively from an altered phage
promoter (19). The structure of the hybrid plasmid is shown
in Fig. 1. Although cells containing this plasmid grow well,
subcultures of the original isolate occasionally contain smaller
plasmids. These probably represent deletions of the plasmid
shown in Fig. 1 that are created by Int-h promoted recombination
between attP and sequences on the plasmid that resemble
attB. To minimize the formation of deletions, we transferred
the plasmid from its original host, a wild-type E. coli,
to K5185, a strain carrying a deletion in the himA gene. Site-specific
recombination is substantially reduced in this strain
(see below) and, accordingly, we find the hybrid plasmid is
considerably more stable. For the experiments reported in
this paper, Int-h was purified from K5185 containing the
hybrid plasmid.

To assay Int-h activity, we measured integrative recombination
in vitro by EcoRI restriction of pBP86, a plasmid
substrate that contains both attP and attB. Two kinds of
reaction mixtures were used. In the standard assay, a source
of Int-h was supplemented with a crude extract from wild-type
E. coli. Using this assay, the purification of Int-h activity
proceeded exactly as described for wild-type Int protein (17,
18). The yield and specific activity of the highly purified Int-h
(Table I) are not substantially different from those found
after purification of Int* protein (17). In a second assay, a
source of Int-h was supplemented with a crude extract from
E. coli carrying a point mutation in the himA gene. This assay
measures a specific quality of the Int-h protein, i.e. the capac-



Fig. 1. Structure of a hybrid plasmid that overproduces Int-h.
The plasmid, pHN16, consists of the cloning vector pRSF2124 and
an EcoRI fragment containing the int-h allele; only the λ insert and
flanking vector sequences are shown. The position of the int gene,
the phage attachment site attP, and target sites for EcoRI (R) and
SmaI (S) endonucleases are shown. The left-pointing arrow indicates
the transcript expected to govern expression of int gene; it begins at
the start point of the PI promoter (made constitutive by the int-C226
mutation) and ends at the tI terminator (31, 32). m, vector
Table I
Purification of Int-h protein

Fraction                               Volume   Protein   Int-h             Specific activity
                                       (ml)     (mg)      (units × 10⁻³)ᵃ   (units/mg)
I.   Crude extractᵇ                    75       2,325     2,250             967
II.  Differential salt precipitation   74       —ᶜ        1,110             —ᶜ
III. Phosphocellulose                  12       12        990               82,500
IV.  Calcium-phosphate cellulose                6         190               31,666

ᵃ The minimum amount of Int that produces maximal recombination ...
ᵇ The yield from 75 liters of culture grown to midlog phase.
ᶜ —, not determined.




Fig. 2. SDS gel electrophoresis of purified Int-h. Lane a, 1.5
µg of Int* protein. Lane b, 1.5 µg of purified (Fraction IV) Int-h
protein. Lane c, ovalbumin (subunit Mr ~ 43,000). Lane d, integration
host factor (subunit Mr ~ 11,500 and 10,000). Sample preparation
and electrophoresis were carried out as described under "Experimental
Procedures."

ity to carry out recombination in the presence of mutant IHF. 
Wild-type Int protein shows almost no activity in this assay, 
whereas crude extracts containing Int-h produce readily de- 
tectable levels of recombination. During the purification of 
Int-h, the ratio of activities in the two assays remained 
constant (data not shown). Unless otherwise stated, all results
reported in this paper are from experiments that use Int-h or 
Int* purified through Step IV of Table I and Ref. 17. 

The purified protein is quite stable. We routinely add 
bovine serum albumin (2 mg/ml) to our purified proteins; 
under these conditions, Int-h activity is stable for at least 1 
year at -70 °C. As found for purified wild-type Int protein,
fractions containing Int-h protein without added bovine 
serum albumin show diminished activity after repeated freez- 
ing and thawing. The purified Int-h protein is nearly homo- 
geneous. As shown in Fig. 2, SDS gel electrophoresis of the 
purified material shows a prominent major band that comigrates
with purified Int* protein at an apparent Mr ~
40,000. On careful examination, some minor bands are evident;
these are similar in molecular weight and intensity to
those seen in preparations of wild-type Int purified from a 
K5185 derivative. The identity in size as well as the similarity 
in purification of Int-h and Int* indicate that the int-h mu- 



Table II

Recombination in vitro and in vivo with Int-h and Int*
In vitro recombination with a plasmid substrate pBP86 was carried
out with purified Int-h or Int* supplemented by sonicates of the
indicated E. coli strains. Reaction conditions, EcoRI restriction, and
Southern blotting analysis were carried out as described under
"Experimental Procedures." Recombination was quantitated by comparison
of the intensity of the 8.1-kb recombinant bands (see Fig. 4 for
a detailed restriction map); serial dilutions of each reaction mixture
were analyzed to facilitate comparison. In vivo recombination was
determined as the fraction of recombinant progeny following infection
of cells containing either pHN16 or pC22642 with λ attB-attP, strain
G903. The protocol for growth and analysis of this phage is described
in Ref. 12.

                    Relative recombinationᵃ
                    In vitro               In vivo
Source of IHF       Int-h     Int*         Int-h     Int*
Wild type           1.0       1 (50%)      1.33      1 (60%)
himA42              0.33      0.004        0.66      0.057
himA82              0.10      <0.001       0.13      0.003ᵇ
None                0.10      0.002        NAᶜ       NA

ᵃ The recombination observed with wild-type Int and IHF is assigned
a value of 1.0; the actual conversion of substrate to recombinant
under these conditions is given in parentheses.

ᵇ This value is at least 10-fold higher than that observed for a
control infection that lacked a source of Int.

ᶜ NA, not applicable.

tation does not radically alter the Int polypeptide. This hy- 
pothesis is supported by the similar sensitivity of the two 
purified proteins to inactivation. Both are readily inactivated 
either by incubation at 45 °C or by exposure to N-ethylmaleimide
(data not shown).

Recombination Promoted by Int-h— Table II compares the
efficiency of recombination promoted by Int-h or Int* in the
presence of different sources of crude IHF. As noted above,
Int-h promotes efficient recombination when supplemented
either with an extract of cells carrying the point mutation
himA42 or supplemented with an extract of wild-type cells.
By contrast, Int* yields very little recombination with the
mutant extract (line 1 versus 2). Int-h cannot utilize all
mutant extracts equally well. Extracts from cells carrying a
deletion mutation, himA82, assist Int-h promoted recombination
less than one-third as well as do extracts from himA42
(Table II, line 2 versus 3). In addition, extracts from cells
carrying the point mutation hip157 or the double mutation
himA42 hip157 are similar to extracts from himA82 cells in
their capacity to assist Int-h promoted recombination (data
not shown). Compared to these extracts, the enhanced recombination
seen when Int-h is supplemented with extracts from
himA42 cells suggests that the himA42 mutation has not
completely inactivated the α subunit of IHF and that Int-h is
better able than Int* to utilize the residual activity. The
relative capacities of Int-h and Int* to promote recombination
with various sources of IHF are not changed by altering the
amount of Int protein, the amount of crude IHF, or the time
of incubation (data not shown).
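
The comparison drawn from Table II can be restated as a fold difference between Int-h and Int* for each source of IHF. A minimal sketch in Python, using the in vitro values as read from Table II; the dictionary layout and the function name are illustrative choices, and the "<0.001" entry is treated as an upper bound rather than a number:

```python
# Relative in vitro recombination as read from Table II
# (wild-type Int with wild-type IHF is assigned 1.0).
# None marks the "<0.001" entry, which is only an upper bound.
IN_VITRO = {
    "wild type": {"Int-h": 1.0, "Int*": 1.0},
    "himA42": {"Int-h": 0.33, "Int*": 0.004},
    "himA82": {"Int-h": 0.10, "Int*": None},
    "none": {"Int-h": 0.10, "Int*": 0.002},
}


def fold_advantage(ihf_source):
    """Ratio of Int-h to Int* recombination, or None if Int* is below detection."""
    row = IN_VITRO[ihf_source]
    if row["Int*"] is None:
        return None
    return row["Int-h"] / row["Int*"]


for source in IN_VITRO:
    print(source, fold_advantage(source))
```

With these inputs the two proteins are equivalent with wild-type IHF, while Int-h is favored by well over an order of magnitude with the himA42 extract or with no IHF at all.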

The recombination observed with extracts from cells containing
the himA82 mutation, a deletion of the himA gene,
shows that Int-h can promote recombination in the complete
absence of a functional himA gene product. This result was
unexpected because it had been reported earlier (9) that, in
vivo, the int-h allele could not suppress the recombination
defect of a strain bearing a deletion of the himA gene.* We



* Recent experiments show that in himA deletion strains, int-h can
promote a low level of excision of λ from a secondary bacterial site
(R. Weisberg, personal communication; D. Friedman, personal
communication).
think this failure reflects the limited amount of Int protein
made under the previous conditions since a small but readily
detected amount of integrative recombination is observed in
vivo after infection of himA82 cells when Int-h is provided
from our overproducing plasmid (Table II).

The ability of Int-h to promote recombination with a wide 
variety of mutant IHF extracts suggests that this protein 
might have recombination activity in the total absence of 
IHF. This is confirmed in the last entry of Table II; this result 
indicates that, to the extent that our preparation is pure, Int- 
h can promote recombination by itself. The amount of IHF- 
independent recombination is not large, about 10% that seen 
when Int-h is supplemented with wild-type IHF. Note that
this amount is similar to that seen when Int-h is supplemented
with crude extracts from a himA deletion or hip mutant strain.
This means that other proteins found in E. coli, including
either of the remaining wild-type subunits of IHF, cannot
assist Int-h in promoting recombination. The capacity of Int-h
to promote recombination by itself is also demonstrated in
Fig. 3A. In the absence of IHF, Int-h (lane a) but not Int*
(lane f) produces detectable recombinants. It should be
pointed out that the mobility of the recombinant fragment 
produced in the absence of IHF is identical to that produced 
in its presence (cf. lanes a and b). This implies that the 
breakage and reunion caused by Int-h acting in the absence 
of IHF occurs at the same sites as observed in the standard 
recombination reaction. 

The remainder of Fig. 3A presents the response of Int* and 
Int-h to increasing amounts of IHF that has been purified 
from wild-type E. coli. Both Int proteins are stimulated by
IHF. We estimate that the amount of IHF needed to maxi- 



Fig. 3. Recombination promoted by varying amounts of Int
and IHF. Supercoiled pBP86 substrate DNA was incubated in the
presence of 50 mM KCl for 25 min as described under "Experimental
Procedures" with proteins as indicated. In panel A, the reactions
contained 6.0 µg/ml of Int-h (a-e) or Int* (f-j) and different
concentrations of IHF: a and f, 0.0 µg/ml; b and g, 0.125 µg/ml; c and h, 0.5
µg/ml; d and i, 1.25 µg/ml; e and j, 2.5 µg/ml. In panel B, IHF was
fixed at 2.5 µg/ml and either Int-h (a-e) or Int* (f-j) was varied: a
and f, 0.15 µg/ml; b and g, 0.30 µg/ml; c and h, 0.60 µg/ml; d and i,
1.20 µg/ml; e and j, 6.00 µg/ml. A pair of arrows indicates the position of
the 8.1-kb recombinant fragment.




Fig. 4. Restriction maps for analysis of intermolecular and
intramolecular recombination. At the top are shown two identical
pBP86 substrate DNA circles. Attachment sites are written as POP'
(attP) and BOB' (attB) where O represents the 15-bp core wherein
the recombination crossover takes place. The position of EcoRI (R)
and PstI (P) restriction sites are marked with arrows. Distances (in
base pairs) between adjacent restriction and/or attachment sites are
written inside each substrate circle. Below the substrate circles are
given the expected fragment lengths following PstI or EcoRI digestion.
The two circular products of intramolecular recombination
between attP and attB are drawn at the bottom left. They are shown
separated from one another but after a typical recombination reaction
they are linked to one another as a catenane (21). The single dimeric
circle that arises from recombination between attP on one circle with
attB on the other is drawn at the lower right. Restriction sites,
attachment sites, and fragment lengths are indicated as for the


mally stimulate Int* is about 3-fold greater than that required 
to stimulate Int-h. This modest difference indicates that Int- 
h is not greatly altered in its affinity for or capacity to utilize
IHF. Fig. 3B shows the effect of adding increasing amounts 
of Int* or Int-h to a fixed, saturating amount of IHF. Similar 
amounts of the two proteins produce similar levels of recom- 
bination. This means that the Int-h mutation has not altered 
the number of Int molecules required to carry out recombination.
No less than 35 Int-h molecules are needed per recom-
bination event. However, as in our earlier studies with Int* 
(15), we do not know the extent to which this stoichiometry 
reflects inactive protein in our purified preparation. 
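
A stoichiometry of this kind rests on converting mass concentrations into a molar ratio of Int monomers to substrate molecules. The sketch below, in Python, shows one way to do that conversion; it is not the authors' calculation, which this section does not spell out. The reaction volume (27 µl), DNA mass (0.30 µg), plasmid size (9.4 kb), and Int Mr (about 40,000) are taken from the text, while the value of 650 g/mol per base pair and the function name are assumptions introduced here:

```python
# Assumed average molar mass of one base pair of duplex DNA (g/mol).
BP_MASS_G_PER_MOL = 650.0


def int_monomers_per_plasmid(int_ug_per_ml, dna_ug=0.30, volume_ul=27.0,
                             plasmid_bp=9400, int_mr=40000.0):
    """Molar ratio of Int monomers to plasmid molecules in one reaction."""
    int_grams = int_ug_per_ml * 1e-6 * (volume_ul / 1000.0)  # µg/ml × ml
    int_mol = int_grams / int_mr
    dna_mol = (dna_ug * 1e-6) / (plasmid_bp * BP_MASS_G_PER_MOL)
    return int_mol / dna_mol


# Int concentrations used in Fig. 3B (µg/ml).
for conc in (0.15, 0.30, 0.60, 1.20, 6.00):
    print(conc, round(int_monomers_per_plasmid(conc), 1))
```

Dividing such a ratio by the fraction of substrate that actually recombines gives an estimate of Int molecules per recombination event, subject to the caveat about inactive protein noted above.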

Recombination in the Absence of IHF— Because the occurrence
of λ integrative recombination in the absence of IHF is
unprecedented, we have investigated this reaction in more
detail. Optimal conditions for IHF-independent recombination
promoted by Int-h are slightly different than those observed
for recombination mixtures in which Int-h or Int* are
supplemented by IHF. The optimal ionic strength is lower (25
mM KCl rather than 70 mM), the kinetics are about 2-fold
slower (half-maximal recombination in 30 min rather than
10-20 min), and more Int protein is required for maximal
recombination (about 2-fold as much). However, even under
these optimal conditions, in the absence of IHF, Int-h does
not recombine more than 15% of the substrate DNA.

The DNA substrate used in Table II and Fig. 3 contains a
pair of attachment sites oriented as a direct repeat. Integrative 
recombination promoted by Int-h in the absence of IHF has 
been observed with two other kinds of substrates. One substrate,
pBP90, contains attP and attB on the same circle of
DNA with the two attachment sites oriented so their core 
sequences form an inverted repeat. The other substrate was a 
pair of circles each of which carries a single attachment site, 
either attP or attB. Both substrates exhibited approximately 
the same capacity for recombination promoted by Int-h in the 
absence of IHF (measured relative to recombination in the 
presence of IHF) as was observed for pBP86 substrate (Table 
III and data not shown). Taken together, these results show 
that the capacity of Int-h to carry out recombination in the 
absence of IHF depends neither on the number of sites per 
circle nor on their orientation (see Ref. 20 for an unusual case 
of λ site-specific recombination that strongly depends on
these factors). 

Could recombination with Int-h in the absence of IHF 
reflect the action of a second protein that contaminates our 
purified Int preparation? This putative contaminant cannot 
be normal IHF because the Int-h protein was purified from a 
strain partially deleted for the himA gene. In addition, SDS 
gel electrophoresis reveals no trace of a polypeptide with the 
mobility of the remaining subunit of IHF (Fig. 2), even when 
the gel is greatly overloaded (data not shown). Since any 
putative contaminant would have to be present in very small 
amounts relative to Int, it would have to function catalytically, 
a mode of action not observed with IHF. We attempted to 
test for such an activity in the following way. We inactivated 
Int-h protein by treatment with N-ethylmaleimide and assayed
for residual IHF or some other N-ethylmaleimide-resistant
component that could either activate purified Int*
protein or stimulate purified Int-h protein; no such compo- 
nent was found. This negative result does not rule out the 




Fig. 5. Restriction analysis of intermolecular versus intramolecular
recombination in the presence or absence of IHF.
Recombination of supercoiled pBP86 substrate was carried out with
the following concentrations of IHF and Int: lane a, 0.0 µg/ml IHF
and 0.0 µg/ml Int; lane b, 0.0 µg/ml IHF and 6.0 µg/ml Int-h; lane c,
0.83 µg/ml IHF and 6.0 µg/ml Int-h. Reactions were carried out in
the presence of 25 mM KCl for 60 min at 25 °C, treated with PstI
restriction endonuclease, and electrophoresed as described under
"Experimental Procedures." The position of the substrate fragments
is indicated at the left and the positions of the fragments diagnostic
for intermolecular recombination (6.5 kb) and total recombination
(3.7 kb) are indicated at the right.



existence of a helping factor in our preparations of Int-h, but 
the simplest interpretation of our results is that Int-h protein 
has an intrinsic capacity to carry out all the steps required 
for integrative recombination. 

Although many features of IHF-independent recombina- 
tion sponsored by Int-h are similar to those of standard 
integrative recombination, there is one surprising difference. 
IHF-independent recombination preferentially recombines 
attachment sites on different molecules as opposed to sites 
that are situated on the same molecule. The reverse is true 
for integrative recombination that is promoted by either Int* 
or Int-h when supplemented by IHF. The bias favoring inter- 
molecular versus intramolecular recombination is revealed by 
restriction analysis. Fig. 4 shows restriction maps for pBP86, 
a substrate with directly repeated attachment sites; recombi- 
nation between attP and attB leads to the appearance of three 
kinds of new bands after digestion with restriction endonu- 
clease PstI: those that arise strictly from intramolecular re- 
combination (1.4-kb circle), those that come strictly from 
intermolecular recombination (6.5-kb linear fragment), and 
those that are produced by both pathways (3.7-kb linear 
fragment).^ Fig. 5 shows that when recombination reactions
carried out in the presence or absence of IHF are analyzed by
restriction with PstI nuclease, the two linear recombinant
fragments are produced in different relative yields. In reactions
carried out in the presence of IHF (lane c), the 6.5-kb
fragment is much less prominent than the 3.7-kb fragment,
indicating that intermolecular recombination plays only a
small role and that intramolecular is the favored reaction.
Intramolecular recombination is also the dominant pathway
in reaction mixtures with wild-type Int and IHF (data not
shown), supporting earlier conclusions (11, 21). In contrast,
in reactions carried out in the absence of IHF (lane b), the
6.5-kb fragment and 3.7-kb fragment are of similar intensity,
indicating that intermolecular recombination is a major pathway.
When the amount of DNA in each band is quantitated
(Table III), it appears that essentially all of the recombinant
product comes from the intermolecular pathway in the absence
of IHF and that less than 5% of the recombinant
product comes from this route in the presence of IHF. This
analysis is supported by the appearance of a substantial
amount of 1.4-kb circular species in reactions carried out in
the presence of IHF and the virtual absence of such species
in reactions carried out in the absence of IHF (data not
shown). The same kind of analysis of intermolecular versus
intramolecular recombination has been done with two other
substrates: pBP1, that is similar to pBP86 but has a different
separation between attachment sites (6.8 versus 1.4 kb) and
pBP90, that has attP and attB oriented as an inverted repeat.
Equivalent results were obtained; in the absence of IHF, 
recombination promoted by Int-h was predominantly inter- 
molecular. 

The conclusion that, in the absence of IHF, Int-h mainly 
promotes intermolecular recombination is confirmed by an
analysis of the unrestricted products of reaction mixtures. To
avoid complications of supercoiling, the products of recombination
of pBP86 substrate were nicked with pancreatic DNase
and electrophoresed on a high resolution gel system as first
described by Sundin and Varshavsky (22). Fig. 6, lanes b and
c, show that recombination in the presence of IHF yields a



^ Note that restriction with endonuclease EcoRI yields only fragments
that are of the third kind. Thus, a bias in intermolecular versus
intramolecular recombination would not have been detected in the
standard assay used to purify Int-h. Note also that recombination
between attP on one molecule with attP on another does not change the
restriction pattern and is not scored in these assays.
Table III

Quantitative comparison of recombination pathways
Recombination was carried out as described in the legend to Fig. 5
and analyzed by treatment with endonuclease PstI and agarose gel
electrophoresis, as described under "Experimental Procedures." The
substrate pBP86 and its recombinant products are described in Fig.
4. Recombination between DNA molecules that contain only a single
attachment site was carried out with supercoiled plasmid pPA1
(containing attP) and EcoRI-linearized plasmid pBB105 (containing
attB). Treatment of these reaction mixtures with endonuclease PstI
yields substrate fragments of 4.2 and 4.9 kb (plus smaller fragments)
and a recombinant fragment of 7.2 kb (plus smaller fragments). wt,
wild type protein; h, Int-h protein.

                                  Recombination
Substrate          IHF   Int      Total      Intermolecular
                                  (%)        (% total recombination)
pBP86              +     wt       66.0ᵃ      2.0ᵇ
pBP86ᶜ             +     h                   4.7ᵇ
pBP86ᶜ             −     h        4.6ᵃ       113.0ᵇ
pPA1 × pBB105      +                         (100)ᵉ
pPA1 × pBB105ᶠ     +     h        25.3ᵈ      (100)ᵉ
pPA1 × pBB105ᶠ     −     h                   (100)ᵉ

ᵃ The cpm in the 3.7-kb fragment was divided by the cpm in the
4.4 plus 5.1-kb fragments from an unrecombined mixture. This ratio
was multiplied by the factor 2.57 to correct for the difference in
fragment sizes; the result is expressed as a percentage.

ᵇ The cpm in the 6.5-kb fragment was divided by the cpm in the
4.4 plus 5.1-kb fragments from an unrecombined mixture. This ratio
was multiplied by the factor 1.45 to correct for the difference in
fragment sizes. This value, the fraction of substrate recombining via
the intermolecular pathway, was divided by the total recombinant
fraction to give the proportion of recombinants that use the
intermolecular pathway. This result is expressed as a percentage.

ᶜ Average of four experiments.

ᵈ The cpm in the 7.2-kb fragment was divided by the cpm in the
4.2 plus 4.9-kb fragments from an unrecombined mixture. This ratio
was multiplied by 1.265 to correct for the difference in fragment sizes.
The result is expressed as a percentage.

ᵉ For these substrates, intermolecular recombination is the only
pathway possible.

ᶠ Average of three experiments.
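
The footnote calculations for the pBP86 substrate can be written out as a short worked example. This is a minimal sketch in Python under stated assumptions: the correction factors 2.57 and 1.45 are the ones quoted in the footnotes (they appear close to the ratio of combined substrate fragment length to product fragment length, for example 9.5/3.7 ≈ 2.57), while the function names and the cpm values are hypothetical and chosen only for illustration:

```python
# Size-correction factors as quoted in the Table III footnotes.
F_TOTAL = 2.57  # applied to the 3.7-kb fragment (total recombination)
F_INTER = 1.45  # applied to the 6.5-kb fragment (intermolecular only)


def fraction_recombined(cpm_fragment, cpm_substrate, factor):
    """Size-corrected fraction of substrate found in a product fragment.

    cpm_substrate is the cpm in the 4.4 plus 5.1-kb substrate fragments
    of an unrecombined mixture.
    """
    return factor * cpm_fragment / cpm_substrate


def percent_total(cpm_3_7, cpm_substrate):
    """Total recombination, as a percentage of substrate."""
    return 100.0 * fraction_recombined(cpm_3_7, cpm_substrate, F_TOTAL)


def percent_intermolecular(cpm_6_5, cpm_3_7, cpm_substrate):
    """Share of recombinants formed by the intermolecular pathway."""
    inter = fraction_recombined(cpm_6_5, cpm_substrate, F_INTER)
    total = fraction_recombined(cpm_3_7, cpm_substrate, F_TOTAL)
    return 100.0 * inter / total


# Hypothetical counts, for illustration only.
cpm_substrate = 10000.0
print(round(percent_total(500.0, cpm_substrate), 2))
print(round(percent_intermolecular(40.0, 500.0, cpm_substrate), 2))
```

Because the two measurements carry independent counting error, the computed intermolecular share is not constrained to lie at or below 100%.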

ladder of bands. Each band in the ladder contains catenanes 
between the two circular products of intramolecular recom- 
bination (21), each step of the ladder representing catenanes 
with a different number of interlocks between the two circles 
(22). In contrast, Fig. 6d shows that in the absence of IHF, 
recombination promoted by Int-h yields mostly dimeric circles.
Isolation of the dimer band and subsequent restriction
with endonuclease EcoRI confirmed that these circles are the
result of intermolecular recombination between an attP on
one substrate molecule and an attB on a second molecule
(data not shown). 

Recombination of Nonsupercoiled DNA — The int-h mutation
has been reported to increase λ site-specific recombination
during infection of strains carrying a mutation in gyrB,
the gene encoding the β subunit of DNA gyrase (9). This
suggests that Int-h protein might be more active than Int* on
DNA substrates that have reduced levels of supercoiling.
Fig. 7 shows a comparison of recombination with supercoiled
and nonsupercoiled substrates. The assays were carried out
in the presence of IHF at low ionic strength, a condition that
is optimal for Int*-promoted recombination of nonsupercoiled
substrates. As previously reported (23, 24), even under these
conditions supercoiled DNA is the better substrate for wild-type
Int protein, recombining about 10 times as fast as nonsupercoiled
DNA (Fig. 7a). Int-h protein is similar to Int* in
the speed with which it promotes recombination of supercoiled
DNA substrates (Fig. 7b). However Int-h promotes recombi-




Fig. 6. Topological analysis of intermolecular versus intramolecular
recombination. Supercoiled plasmid pBP86 was recombined
as described in the legend to Fig. 5 with the following
concentrations of IHF and Int: lane a, 0.0 µg/ml IHF and 0.0 µg/ml Int; lane
b, 0.83 µg/ml IHF and 4.0 µg/ml Int*; lane c, 0.50 µg/ml IHF and 1.5
µg/ml Int-h; lane d, 0.0 µg/ml IHF and 6.0 µg/ml Int-h. After 60 min
at 25 °C, the reaction was adjusted to 100 mM KCl, 10 mM MgCl2
and digested for 10 min at 37 °C with sufficient pancreatic DNase to
introduce several nicks per substrate. The samples were then
electrophoresed as described (22). The position of the following monomeric
(9.4 kb) and dimeric species are indicated: 1, linear monomer; 2,
nicked monomer circle; 3, linear dimer; 4, nicked dimer circle. The
linear species arise from excessive digestion with pancreatic DNase.




Fig. 7. Recombination of supercoiled and nonsupercoiled
substrates. The substrate was a mixture of two DNAs, the supercoiled
form of pBP86 and the nicked circle form of pBP86dl, a
deletion derivative of pBP86 (constructed in this laboratory by T.
Pollock) that has lost 600 base pairs from the 1.4-kb segment that
separates attP and attB. An aliquot of this substrate (0.5 µg) was
incubated as described in the legend to Fig. 5 with 0.5 µg/ml IHF and
either 4.0 µg/ml Int* (panel a) or 3.0 µg/ml Int-h (panel b). After
various times, aliquots were removed and stopped by heating to 65 °C,
digested with restriction endonuclease BamHI, and electrophoresed
through polyacrylamide. The radioactivity in a fragment diagnostic
of recombination of the supercoiled substrate (1.4 kb) or nonsupercoiled
substrate (0.8 kb) was determined as described under "Experimental
Procedures." Each value was divided by the radioactivity in
a PstI/BamHI fragment characteristic of the appropriate substrate
and the ratio was adjusted to yield the extent of recombination as in
Table III. The average of four and six experiments is plotted in a and
b, respectively. , supercoiled substrate; , nonsupercoiled substrate.

nation of nonsupercoiled DNA with an initial velocity about
3 times faster than that observed with Int*. We conclude from
these experiments that, when supplemented by IHF, Int-h is
superior to Int* in promoting recombination of nonsupercoiled
DNA.

The results just presented show that supercoiling imparts 
a modest benefit to recombination promoted by Int-h. We 
next asked whether this benefit requires the presence of IHF. 
Supercoiled and nonsupercoiled DNA were compared as sub- 
strates for Int-h promoted recombination in the absence of 
IHF. Inspecting gels like those used to generate Fig. 7 indi- 
cated that the efficiency and kinetics of recombination were 
similar for both substrates (data not shown). This impression 
was confirmed by an experiment in which the extent of 
recombination was quantitated. After 20 min of incubation, 
supercoiled and nonsupercoiled substrates both had under- 
gone 3.5% recombination; after 90 min, the yield of recombi- 
nants from both substrates was 14.0%. It appears that Int-h 
protein, by itself, does not distinguish between supercoiled 
and nonsupercoiled DNA substrates. We have considered one 
trivial explanation for this lack of discrimination. It has been 
shown earlier that Int protein contains a topoisomerase activity
that relaxes supercoiled DNA (4). This activity is intrinsically
weak relative to the recombination activity of Int and
is partially inhibited by IHF (25). If Int-h protein displayed a
much stronger topoisomerase activity, supercoiled DNA
might be relaxed in reaction mixtures before recombination
could occur. However, this hypothesis is not supported by a 
comparison of the relaxing activities of Int* and Int-h proteins 
(Fig. 8). Moreover, no major difference is seen in the relaxa- 
tion of the substrate DNA when similar amounts of Int* and 
Int-h are used in recombination reactions (data not shown). 
We conclude that the failure of Int-h to recombine supercoiled 
and nonsupercoiled substrates with different efficiency means 
that IHF is an essential part of the mechanism that senses 
the superhelicity of the recombination substrate. 

DISCUSSION 

Purified Int-h protein carries out substantial amounts of 
integrative recombination in the apparent absence of IHF. 
Thus, one must conclude that Int protein can manifest all the 
activities required for the elementary steps of recombination 
between specific sites. As mentioned in the Introduction, 
earlier observations indicated that wild-type Int carries the 
catalytic center responsible for breakage and reunion. Our 
studies with Int-h confirm these conclusions and demonstrate, 
for the first time, that this protein can also specify the steps 
required for synapsis. Unless the int-h allele creates new 




Fig. 8. Relaxing activity of Int* and Int-h. Radioactively labeled
supercoiled plasmid pPAl (DNA) was incubated as described
under "Experimental Procedures" with identical amounts of Int*
( ) or Int-h ( ). The reactions were analyzed by cesium
chloride-ethidium bromide centrifugation as described under
"Experimental Procedures." The degree of relaxation relative to a
fully relaxed plasmid is plotted as a function of the length of the
incubation.



properties rather than enhancing properties inherent in the 
parent gene, we must conclude that wild-type Int also has the 
capacity to carry out synapsis as well as recognition and 
strand exchange. Indeed, when sensitive methods are used to 
probe for recombinant products, wild-type Int does reveal 
some ability to carry out recombination in the absence of 
IHF, albeit at a level 50-fold lower than that observed with 
Int-h acting alone and 500-fold lower than observed in the 
presence of IHF (Table II). By implication, IHF must play an 
accessory role in recombination, enhancing the capacity of 
Int to carry out one or more of the critical steps in recombi- 
nation. 

What mechanism underlies the enhanced capacity of Int-h 
to carry out recombination in strains mutant for IHF? Our
studies have ruled out three kinds of explanation. First, we 
have shown that the int-h allele does not simply lead to the 
overproduction of an essentially wild-type protein. The 
amount of Int protein we recover from himA cells expressing
Int* or Int-h alleles is virtually identical, confirming and 
extending studies on the rate of synthesis of Int* and Int-h 
polypeptides (9). A second possible explanation for the en- 
hanced activity of the int-h allele is that it specifies an Int 
protein with an enhanced capacity for enzymatic turnover. 
Previous studies had demonstrated that many molecules of 
purified Int* protein are required to produce a single recom- 
binant (15, 26, 27). This implies that Int promotes recombination
by a stoichiometric rather than catalytic action on
attachment sites. If a molecule of Int-h protein were able to 
participate in more than one recombination event, an intrins- 
ically weak capacity to carry out recombination in the absence 
of IHF would be magnified. However, the same value for the 
amount of protein required to produce a recombinant is 
observed with our purified preparation of Int-h protein as is 
seen with purified Int*. Therefore, an increased capacity for 
turnover is not the basis for the Int-h phenotype. A third 
hypothesis for the behavior of Int-h invokes preferential 
interaction between Int-h and IHF, permitting Int-h to utilize 
IHF whose quality or quantity is altered by mutation. This 
hypothesis, which was our favorite at the outset of this work, 
is ruled out as the sole explanation for the behavior of Int-h 
by our observation that Int-h carries out recombination in 
the complete absence of IHF. Titration curves show about a 
3-fold difference in the levels of IHF required to stimulate
recombination by Int-h and Int*. This small effect and the 
enhanced recombination in himM2 (as opposed to other IHF 
mutants) indicate that Int-h may have a preferred interaction 
with IHF but this putative alteration cannot be a major factor 
in explaining the Int-h phenotype. 

Many possibilities remain as plausible explanations for the
behavior of Int-h protein. These can be organized around the
concept that recombination can be divided into steps of recognition,
synapsis, and strand exchange. Alteration in recognition
of attachment sites by Int-h is a very straightforward
hypothesis. For example, Int-h might have a higher affinity
for the core region of attP than does Int*. Footprinting studies
have shown that IHF binds to segments of attP that flank the
region of the core that is occupied by Int (reviewed in Ref. 1).
Moreover, similar studies have revealed that IHF enhances
the binding of Int* to the core region.* Thus, it is attractive
to imagine that Int-h can function in the absence of IHF
because it binds more tightly to the core than does Int*. This
argument can easily be extended to explain the enhanced
capacity of Int-h to promote integration at secondary bacterial
attachment sites (9). In addition to models invoking altered

* N. Craig and H. Nash, manuscript in preparation. 
recognition of attachment sites, equally attractive hypotheses
can be constructed concerning alterations in the capacity of 
Int-h to carry out the later steps of recombination. For this 
purpose, it is best to consider a formal scheme for synapsis 
and strand exchange embodied by the equation: 

attP* + attB* ⇌ (attP*-attB*) → attL + attR

In this scheme, attP* and attB* represent attachment sites 
loaded with recombination proteins, (attP*-attB*) represents 
the synaptic intermediate in which the two sites are juxta- 
posed with the 15-base pair cores aligned, and attL + attR 
represent the products of strand exchange. Since synapsis is 
postulated to be a reversible process, the efficiency of recom- 
bination will be enhanced by changes that either stabilize the 
synaptic intermediate or accelerate its conversion to a recom- 
binant product. If the alteration in Int-h changes either of 
these characteristics, it would be easy to understand how this 
protein might be better able to utilize what little synaptic 
intermediate might form under restrictive conditions such as 
the absence of IHF. Although our experiments have not yet 
defined which of these mechanisms is responsible for the 
altered recombination activity of Int-h, we feel the present 
work opens the way to a rational investigation of this problem. 
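
The formal scheme above can be made semi-quantitative with a simple mass-action sketch. The rate constants k_1, k_{-1} and k_2 are illustrative symbols that do not appear in the original text, and the steady-state treatment of the synaptic intermediate S = (attP*-attB*) is an assumption made only for this illustration:

\[
\frac{d[\mathrm{S}]}{dt} = k_{1}\,[attP^{*}][attB^{*}] - (k_{-1} + k_{2})\,[\mathrm{S}] = 0
\quad\Longrightarrow\quad
v = k_{2}\,[\mathrm{S}] = \frac{k_{1}\,k_{2}}{k_{-1} + k_{2}}\,[attP^{*}][attB^{*}]
\]

Under these assumptions the rate of product formation v rises if k_{-1} falls (a more stable synaptic intermediate) or if k_2 rises (faster conversion to recombinant product), which are the two alternatives named in the text.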

Our most surprising finding is that, when acting alone, Int- 
h preferentially recombines two attachment sites that are 
located on separate circles rather than two sites that are 
situated on the same circle. That is to say, in the absence of 
IHF, Int-h promotes intermolecular rather than intramolec- 
ular recombination. Precisely the opposite is true when Int-h 
carries out recombination in the presence of IHF. This same 
bias, intramolecular recombination over intermolecular re- 
combination, has been observed both in vivo (12) and in vitro 
(21) for wild-type Int protein in the presence of IHF. A 
preference for intramolecular recombination is readily under- 
standable since the effective concentration of attachment site 
pairs should be high when the sites are tethered to each other 
by a length of flexible DNA (28, 29). Thus, the preference for
intermolecular recombination shown by Int-h in the absence 
of IHF is surprising. This bias has been observed with two 
different substrates: one containing directly repeated attachment
sites and one containing an inversely repeated pair of
sites. We think, therefore, that the phenomenon is intrinsic 
to IHF-independent recombination and we have considered 
two kinds of explanation for our observations. The first model 
states that, in the absence of IHF, intramolecular recombi- 
nation is suppressed while intermolecular recombination pro- 
ceeds at its normal rate. This could come about as a result of 
Int-h binding to substrate DNA. It is known that, in addition 
to specific binding to attachment sites, Int protein has a
substantial nonspecific affinity for DNA. Footprinting (6)
and electron microscopic (10) studies have shown that, at 
high ratios of Int to DNA, long stretches of DNA become 
covered with protein. If this were to happen to a recombina- 
tion substrate, the DNA between attachment sites might be 
made sufficiently stiff so that the capacity of two sites on the 
same circle to become juxtaposed would decrease. A second
class of models to explain the bias toward intermolecular
recombination has been suggested by the observation that Int
can aggregate DNA.* If the local concentration of DNA molecules
is raised sufficiently by aggregation, statistical theory
implies that random intermolecular collisions between attachment
sites will predominate over intramolecular events (28,
29). An interesting example of this kind of phenomenon has
been recently observed for the joining of cohesive ends by 



DNA ligase in the presence or absence of volume excluders
like polyethylene glycol. In the absence of polyethylene glycol,
intramolecular ligation to form circles is the predominant 
mode but when the effective concentration of DNA is raised 
by the addition of polyethylene glycol, circles are not formed 
and linear multimers accumulate (30). We imagine that, in 
the absence of IHF, Int-h may promote aggregation of sub- 
strate DNA either because of nonspecific charge-shielding 
effects or because of interactions between Int proteins bound 
to the DNA. Regardless of which proposed mechanism is 
responsible for the preference for intermolecular recombina- 
tion, it should be emphasized that IHF reverses this bias. 
This means that IHF not only stimulates recombination 
promoted by Int-h and Int* but also changes the capacity of
Int to bind nonspecifically to DNA and/or to form DNA 
aggregates. In this context, it is interesting to note that IHF 
prevents the formation of DNA aggregates by Int.* 
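
The competition between intramolecular and intermolecular events discussed above can be illustrated with a deliberately simplified sketch. It assumes that the favoured mode is decided only by comparing the effective concentration of the tethered partner site with the concentration of sites on other molecules; the function name and the numerical values are hypothetical and are not taken from the paper.

```python
def favoured_mode(effective_conc: float, bulk_conc: float) -> str:
    """Return which recombination mode a simple concentration
    comparison would favour.

    effective_conc: local concentration of the partner site on the
        same circle (hypothetical value, any consistent unit).
    bulk_conc: concentration of partner sites on other molecules,
        e.g. raised by aggregation (hypothetical value, same unit).
    """
    if effective_conc <= 0 or bulk_conc <= 0:
        raise ValueError("concentrations must be positive")
    if effective_conc > bulk_conc:
        return "intramolecular"
    if effective_conc < bulk_conc:
        return "intermolecular"
    return "balanced"


# Dilute substrate: the tethered partner site dominates.
print(favoured_mode(effective_conc=1e-7, bulk_conc=1e-9))
# Aggregated substrate: sites on other molecules are locally abundant.
print(favoured_mode(effective_conc=1e-7, bulk_conc=1e-5))
```

The sketch ignores DNA stiffness and protein coating, both of which the text proposes as alternative explanations, so it illustrates only the aggregation model.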

IHF may not be the only E. coli protein that can reverse 
the intermolecular recombination bias of Int-h. Substantial 
amounts of intramolecular recombination are observed when 
Int-h is supplemented with crude extracts from cells carrying 
a deletion in the himA gene.* It may be that many basic 
proteins will share with IHF the capacity to interfere with 
non-specific binding to DNA or aggregation of DNA by Int- 
h. We are led to speculate that the action of many DNA 
binding proteins that, like Int, have a significant nonspecific 
binding affinity and a tendency to aggregate is modulated by 
the kind of interaction with accessory proteins that we have 
uncovered in this work. 
CIRCULAR DNA EXPRESSION CASSETTES
FOR IN VIVO GENE TRANSFER

Gene therapy consists in correcting a deficiency or an
abnormality by introducing genetic information into the
affected cell or organ. This information may be introduced
either in vitro into a cell extracted from the organ and then
reinjected into the body, or in vivo, directly into the tissue
concerned. Being a high molecular weight, negatively
charged molecule, DNA has difficulties in passing spontaneously
through the phospholipid cell membranes. Different
vectors are hence used in order to permit gene transfer: viral
vectors on the one hand, natural or synthetic, chemical
and/or biochemical vectors on the other hand. Viral vectors
(retroviruses, adenoviruses, adeno-associated viruses, etc.)
are very effective, in particular in passing through
membranes, but present a number of risks, such as
pathogenicity, recombination, replication, immunogenicity,
etc. Chemical and/or biochemical vectors enable these risks
to be avoided (for reviews, see Behr, 1993, Cotten and
Wagner, 1993). These vectors are, for example, cations
(calcium-phosphate, DEAE-dextran, etc.) which act by
forming precipitates with DNA, which precipitates can be
"phagocytosed" by the cells. They can also be liposomes in
which DNA is incorporated and which fuse with the plasma
membrane. Synthetic gene transfer vectors are generally
lipids or cationic polymers which complex DNA and form a
particle therewith carrying positive surface charges. These
particles are capable of interacting with the negative charges
of the cell membrane and then of crossing the latter.
Dioctadecylamidoglycylspermine (DOGS, Transfectam™) or
N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium
chloride (DOTMA, Lipofectin™) may be mentioned as
examples of such vectors. Chimeric proteins have also been
developed: they consist of a polycationic portion which
condenses DNA, linked to a ligand which binds to a membrane
receptor and carries the complex into the cells by
endocytosis. It is thus theoretically possible to "target" a
tissue or certain cell populations so as to improve the in vivo
bioavailability of the transferred gene.

However, the use of chemical and/or biochemical vectors
or of naked DNA implies the possibility of producing large
amounts of DNA of pharmacological purity. In effect, in
these gene therapy techniques, the medicinal product consists
of the DNA itself, and it is essential to be able to
manufacture, in appropriate amounts, DNAs having suitable
properties for therapeutic use in man.

The plasmids currently used in gene therapy carry (i) an
origin of replication, (ii) a marker gene such as a gene for
resistance to an antibiotic (kanamycin, ampicillin, etc.) and
(iii) one or more transgenes with sequences required for
their expression (enhancer(s), promoter(s), polyadenylation
sequences, etc.). These plasmids currently used in gene
therapy (in clinical trials such as the treatment of
melanomas, Nabel et al., 1992, or in experimental studies)
display, however, some drawbacks associated, in particular,
with their dissemination in the body. Thus, as a result of this
dissemination, a competent bacterium present in the body
can, at a low frequency, receive this plasmid. The chance of
this occurring is all the greater for the fact that the treatment
in question entails in vivo gene therapy in which the DNA
may be disseminated in the patient's body and may come
into contact with bacteria which infect this patient or alternatively
with bacteria of the commensal flora. If the bacterium
which is a recipient of the plasmid is an enterobacterium
such as E. coli, this plasmid may replicate. Such an
event then leads to the dissemination of the therapeutic gene.
Inasmuch as the therapeutic genes used in gene therapy
treatments can code, for example, for a lymphokine, a
growth factor, an anti-oncogene, or a protein whose function
is lacking in the host and hence enables a genetic defect to
be corrected, the dissemination of some of these genes could
have unforeseeable and worrying effects (for example if a
pathogenic bacterium were to acquire the gene for a human
growth factor). Furthermore, the plasmids used in non-viral
gene therapy also possess a marker for resistance to an
antibiotic (ampicillin, kanamycin, etc.). Hence the bacterium
acquiring such a plasmid has an undeniable selective
advantage, since any therapeutic antibiotic treatment using
an antibiotic of the same family as the one selecting the
resistance gene of the plasmid will lead to the selection of
the plasmid in question. In this connection, ampicillin
belongs to the β-lactams, which is the family of antibiotics
most widely used in the world. It is hence necessary to seek
to limit as far as possible the dissemination of the therapeutic
genes and the resistance genes. Moreover, the genes carried
by the plasmid, corresponding to the vector portion of the
plasmid (function(s) required for replication, resistance
gene), also run the risk of being expressed in the transfected
cells. There is, in effect, a transcription background, which
cannot be ruled out, due to the host's expression signals on
the plasmid. This expression of exogenous proteins may be
thoroughly detrimental in a number of gene therapy
treatments, as a result of their potential immunogenicity and
hence of the attack of the transfected cells by the immune
system.

Hence it is especially important to be able to have at
one's disposal medicinal DNA molecules having a genetic
purity suitable for therapeutic use. It is also especially
important to have at one's disposal methods enabling these
DNA molecules to be prepared in amounts appropriate for
pharmaceutical use. The present invention provides a solution
to these problems.

The present invention describes, in effect, DNA molecules
which can be used in gene therapy, having greatly
improved genetic purity and impressive properties of
bioavailability. The invention also describes an especially
effective method for the preparation of these molecules and for
their purification.

The present invention lies, in particular, in the development
of DNA molecules which can be used in gene therapy,
virtually lacking any non-therapeutic region. The DNA
molecules according to the invention, also designated
minicircles on account of their circular structure, their small
size and their supercoiled form, display many advantages.
They make it possible, in the first place, to eliminate the
risks associated with dissemination of the plasmid, such as
(1) replication and dissemination which may lead to an
uncontrolled overexpression of the therapeutic gene, (2) the
dissemination and expression of resistance genes, and (3)
the expression of genes present in the non-therapeutic portion
of the plasmid, which are potentially immunogenic
and/or inflammatory, and the like. The genetic information
contained in the DNA molecules according to the invention
is limited, in effect, essentially to the therapeutic gene(s) and
to the signals for regulation of its/their expression (neither
origin of replication, nor gene for resistance to an antibiotic,
and the like). The probability of these molecules (and hence
of the genetic information they contain) being transferred to
a microorganism and being stably maintained is almost zero.
Furthermore, due to their small size, DNA molecules
according to the invention potentially have better
bioavailability in vivo. In particular, they display improved
capacities for cell penetration and cellular distribution. Thus, it is
[...]d that the coefficient of diffusion in the tissues is
inversely proportional to the molecular weight (Jain, 1987).
Similarly, at cellular level, high molecular weight molecules
have inferior permeability through the plasma membrane. In
addition, for the plasmid to progress to the nucleus, which is
essential for its expression, high molecular weight is also a
drawback, the nuclear pores imposing a size limit for
diffusion to the nucleus (Landford et al., 1986). The elimination
of the non-therapeutic portions of the plasmid (origin
of replication and resistance gene in particular) according to
the invention also enables the size of the DNA molecules to
be decreased. This decrease may be estimated at a factor of
2, reckoning, for example, 3 kb for the origin of replication
and the resistance marker (vector portion) and 3 kb for the
transgene with the sequences required for its expression.
This decrease (i) in molecular weight and (ii) in negative
charge endows the molecules of the invention with
improved capacities for tissue, cellular and nuclear diffusion
and bioavailability.

Hence a first subject of the invention lies in a double-stranded
DNA molecule having the following features: it is
circular in shape and essentially comprises one or more
genes of interest. As stated above, the molecules of the
invention essentially lack non-therapeutic regions, and especially
an origin of replication and/or a marker gene. In
addition, they are advantageously in supercoiled form.

The present invention is also the outcome of the development
of a method, of constructions and of cell hosts which
are specific and especially effective for the production of
these therapeutic DNA molecules. More especially, the
method according to the invention lies in the production of
therapeutic DNA molecules defined above, by excision from
a plasmid or from a chromosome by site-specific recombination.
The method according to the invention is especially
advantageous, since it does not necessitate a prior step of
purification of the plasmid, is very specific, especially
effective, does not decrease the amounts of DNA produced
and leads directly to therapeutic molecules of very great
genetic purity and of great bioavailability. This method
leads, in effect, to the generation of circular DNA molecules
(minicircles) essentially containing the gene of interest and
the regulator sequences permitting its expression in the cells,
tissue, organ or apparatus, or even the whole body, in which
the expression is desired. In addition, these molecules may
then be purified by standard techniques.

The site-specific recombination may be carried out by
means of various systems which lead to site-specific recombination
between sequences. More preferably, the site-specific
recombination in the method of the invention is
obtained by means of two specific sequences which are
capable of recombining with one another in the presence of
a specific protein, generally designated recombinase. For
this reason, the DNA molecules according to the invention
generally comprise, in addition, a sequence resulting from
this site-specific recombination. The sequences permitting
the recombination used in the context of the invention
generally comprise from 5 to 100 base pairs, and more
preferably fewer than 50 base pairs.

The site-specific recombination may be carried out in
vivo (that is to say in the host cell) or in vitro (that is to say
on a plasmid preparation).

In this connection, the present invention also provides
particular genetic constructions suitable for the production
of the therapeutic DNA molecules defined above. These
genetic constructions, or recombinant DNAs, according to
the invention comprise, in particular, the gene or genes of
interest flanked by the two sequences permitting site-specific
recombination, positioned in the direct orientation. The
position in the direct orientation indicates that the two
sequences follow the same 5'-3' polarity in the recombinant
DNA according to the invention. The genetic constructions
of the invention can be double-stranded DNA fragments
(cassettes) essentially composed of the elements mentioned
above. These cassettes can be used for the construction of
cell hosts having these elements integrated in their genome
(FIG. 1). The genetic constructions of the invention can also
be plasmids, that is to say any linear or circular DNA
molecule capable of replicating in a given host cell, containing
the gene or genes of interest flanked by the two
sequences permitting site-specific recombination, positioned
in the direct orientation. The construction can be, more
specifically, a vector (such as a cloning and/or expression
vector), a phage, a virus, and the like. These plasmids of the
invention may be used to transform any competent cell host
for the purpose of the production of minicircles by replication
of the plasmid followed by excision of the minicircle
(FIG. 2).

In this connection, another subject of the invention lies in
a recombinant DNA comprising one or more genes of
interest, flanked by two sequences permitting site-specific
recombination, positioned in the direct orientation.

The recombinant DNA according to the invention is
preferably a plasmid comprising at least:

a) an origin of replication and optionally a marker gene,

b) two sequences permitting a site-specific recombination,
positioned in the direct orientation, and,

c) placed between said sequences b), one or more genes
of interest.

The specific recombination system present in the genetic
constructions according to the invention can be of different
origins. In particular, the specific sequences and the recombinases
used can belong to different structural classes, and in
particular to the integrase family of bacteriophage λ or to the
resolvase family of the transposon Tn3.

Among recombinases belonging to the integrase family of
bacteriophage λ, there may be mentioned, in particular, the
integrase of the phages lambda (Landy et al., Science 197
(1977) 1147), P22 and φ80 (Leong et al., J. Biol. Chem. 260
(1985) 4468), HP1 of Haemophilus influenzae (Hauser et al.,
J. Biol. Chem. 267 (1992) 6859), the Cre integrase of phage
P1, the integrase of the plasmid pSAM2 (EP 350,341) or
alternatively the FLP recombinase of the 2μ plasmid. When
the DNA molecules according to the invention are prepared
by recombination by means of a site-specific system of the
integrase family of bacteriophage lambda, the DNA molecules
according to the invention generally comprise, in
addition, a sequence resulting from the recombination
between two att attachment sequences of the corresponding
bacteriophage or plasmid.

Among recombinases belonging to the family of the
transposon Tn3, there may be mentioned, in particular, the
resolvase of the transposon Tn3 or of the transposons Tn21
and Tn522 (Stark et al., 1992); the Gin invertase of bacteriophage
mu or alternatively the resolvase of plasmids, such
as that of the par fragment of RP4 (Eberl et al., Mol.
Microbiol. 12 (1994) 131). When the DNA molecules
according to the invention are prepared by recombination by
means of a site-specific system of the family of the transposon
Tn3, the DNA molecules according to the invention
generally comprise, in addition, a sequence resulting from
the recombination between two recognition sequences of the
resolvase of the transposon in question.

According to a particular embodiment, in the genetic
constructions of the present invention, the sequences permitting
site-specific recombination are derived from a bacteriophage.
More preferably, these latter are attachment
sequences (attP and attB sequences) of a bacteriophage, or
derived sequences. These sequences are capable of recombining
specifically with one another in the presence of a
recombinase designated integrase. The term derived
sequence includes the sequences obtained by
modification(s) of the attachment sequences of the
bacteriophages, which retain the capacity to recombine
specifically in the presence of the appropriate recombinase.
Thus, such sequences can be reduced fragments of these
sequences or, on the contrary, fragments extended by the
addition of other sequences (restriction sites, and the like).
They can also be variants obtained by mutation(s), in
particular by point mutation(s). The terms attP and attB
sequences of a bacteriophage or of a plasmid denote, according
to the invention, the sequences of the recombination
system specific to said bacteriophage or plasmid, that is to
say the attP sequence present in said phage or plasmid and
the corresponding chromosomal attB sequence.

By way of preferred examples, there may be mentioned,
in particular, the attachment sequences of the phages
lambda, P22, φ80, P1 and HP1 of Haemophilus influenzae
or alternatively of plasmid pSAM2 or the 2μ plasmid. These
sequences are advantageously chosen from all or part of the
sequences SEQ ID No. 1, SEQ ID No. 2, SEQ ID No. 6,
SEQ ID No. 7, SEQ ID No. 8, SEQ ID No. 9, SEQ ID No.
10, SEQ ID No. 11, SEQ ID No. 12, SEQ ID No. 13 and
SEQ ID No. 14. These sequences comprise, in particular, the
central region homologous to the attachment sequences of
these phages.

In this connection, a preferred plasmid according to the
present invention comprises

(a) a bacterial origin of replication and optionally a
marker gene,

(b) the attP and attB sequences of a bacteriophage selected
from the phages lambda, P22, φ80, HP1 and P1 or of
plasmid pSAM2 or the 2μ plasmid, or derived
sequences; and,

(c) placed between said sequences b), one or more genes
of interest.

According to an especially preferred embodiment, the
sequences in question are the attachment sequences (attP and
attB) of phage lambda. Plasmids carrying these sequences
are, in particular, the plasmids pXL2648, pXL2649 or
pXL2650. When these plasmids are brought, in vivo or in
vitro, into contact with the integrase of phage lambda, the
sequences recombine with one another to generate in vivo or
in vitro, by excision, a minicircle according to the invention
essentially comprising the elements (c), that is to say the
therapeutic portion (FIG. 2).
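
The excision described above can be illustrated with a minimal length-bookkeeping sketch. It assumes a circular plasmid on which the two recombination sites are treated as points in the direct orientation, with the gene of interest lying between them; the function name, the coordinates and the 6 kb plasmid size are hypothetical and are chosen only to echo the 3 kb vector portion and 3 kb transgene estimate given in the specification.

```python
from typing import NamedTuple


class ExcisionProducts(NamedTuple):
    minicircle_bp: int   # segment between the two sites (therapeutic portion)
    miniplasmid_bp: int  # remainder (origin of replication, marker gene)


def excise_between_direct_sites(plasmid_bp: int,
                                first_site: int,
                                second_site: int) -> ExcisionProducts:
    """Split a circular plasmid into two circles by recombination
    between two point-like sites in the direct orientation.

    Coordinates are hypothetical 0-based positions with
    first_site < second_site; the sites' own lengths are ignored.
    """
    if not (0 <= first_site < second_site < plasmid_bp):
        raise ValueError("need 0 <= first_site < second_site < plasmid_bp")
    minicircle = second_site - first_site
    return ExcisionProducts(minicircle, plasmid_bp - minicircle)


# Hypothetical 6 kb plasmid: 3 kb cassette between the sites, 3 kb vector portion.
products = excise_between_direct_sites(6000, 500, 3500)
print(products.minicircle_bp, products.miniplasmid_bp)
```

With these illustrative numbers the minicircle is half the size of the parent plasmid, consistent with the factor-of-2 reduction estimated in the specification.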

Still according to a particular embodiment of the
invention, the sequences permitting site-specific recombination
are derived from the loxP region of phage P1. This
region is composed essentially of two inverted repeat
sequences capable of recombining specifically with one
another in the presence of a protein, designated Cre
(Sternberg et al., J. Mol. Biol. 150 (1971) 467). In a
particular variant, the invention hence relates to a plasmid
comprising (a) a bacterial origin of replication and optionally
a marker gene; (b) the inverted repeat sequences of
bacteriophage P1 (loxP region); and (c), placed between said
sequences (b), one or more genes of interest.

According to another particular embodiment, in the
genetic constructions of the present invention, the sequences
permitting site-specific recombination are derived from a
transposon. More preferably, the sequences in question are
recognition sequences of the resolvase of a transposon, or
derived sequences. By way of preferred examples, there may
be mentioned, in particular, the recognition sequences of the
transposons Tn3, Tn21 and Tn522. By way of a preferred
example, there may be mentioned the sequence SEQ ID No.
15 or a derivative of the latter (see also Sherratt, pp. 163-184,
Mobile DNA, Ed. D. Berg and M. Howe, American Society
for Microbiology, Washington D.C. 1989).

According to another especially advantageous variant, the
plasmids of the invention comprise, in addition, a multimer
resolution sequence. This is preferably the mrs (multimer
resolution system) sequence of the plasmid RK2. More
preferably, the invention relates to a plasmid comprising:

(a) a bacterial origin of replication and optionally a
marker gene,

(b) the attP and attB sequences of a bacteriophage, in the
direct orientation, selected from the phages lambda,
P22, φ80, HP1 and P1 or of plasmid pSAM2 or the 2μ
plasmid, or derived sequences; and,

(c) placed between said sequences (b), one or more genes
of interest and the mrs sequence of plasmid RK2.

This embodiment is especially advantageous. Thus, when
plasmids pXL2649 or pXL2650 are brought into contact
with the integrase of the bacteriophage in vivo, the
sequences recombine to generate the minicircle and the
miniplasmid, but also multimeric or topological forms of
minicircle or of miniplasmid. It is especially advantageous
to be able to decrease the concentration of these forms in
order to increase the production and facilitate the
purification of minicircle.

The multimeric forms of plasmids are known to a person
skilled in the art. For example, the cer fragment of ColE1
(Summers et al., 1984 Cell 36 p. 1097) or the mrs site of the
par locus of RK2 (L. Eberl 1994 Mol. Microbiol. 12 p. 131)
permit the resolution of multimers of plasmids and
participate in an enhanced stability of the plasmid. However,
whereas resolution at the cer site requires four proteins
encoded by the E. coli genome (Colloms et al., 1990 J.
Bacteriol. 172 p. 6973), resolution at the mrs site requires
only the ParA protein for which the parA gene is mapped on
the par locus of RK2. As a result, it would appear
advantageous to use all or a portion of the par locus containing
parA and the mrs sequence. For example, the mrs sequence
may be placed between the attB and attP sequences of phage
lambda, and the parA gene be expressed in trans or in cis
from its own promoter or from an inducible promoter. In this
connection, a particular plasmid of the invention comprises:

(a) a bacterial origin of replication and optionally a
marker gene,

(b) the attP and attB sequences of a bacteriophage, in the
direct orientation, selected from the phages lambda,
P22, φ80, HP1 and P1 or of plasmid pSAM2 or the 2μ
plasmid, or derived sequences,

(c) placed between said sequences (b), one or more genes
of interest and the mrs sequence of plasmid RK2, and

(d) the parA gene of plasmid RK2.

One such plasmid is, in particular, the plasmid pXL2960
described in the examples. It may be employed, and can
enable minicircle to be produced exclusively in monomeric
form.

According to another advantageous variant, the plasmids
of the invention comprise two sets of site-specific
recombination sequences from a different family. These
advantageously comprise a first set of integrase-dependent
sequences and a second set of parA-dependent sequences.
The use of two sets of sequences enables the production
yields of minicircles to be increased when the first
site-specific recombination is incomplete. Thus, when plasmids
pXL2650 or pXL2960 are brought into contact with the 
integrase of the bacteriophage in vivo, the sequences recom- 
bine to generate the miniplasmid and the minicircle, but this 
reaction is not complete (5 to 10% of initial plasmid may be 
left). The introduction, in proximity to each of the att 
sequences of phage lambda, of an mrs sequence of RK2 
enables the production of minicircles to be increased. Thus, 
after induction of the integrase of phage lambda and Int-
dependent recombination, the unrecombined molecules will
be able to come under the control of the ParA protein of RK2
and to recombine at the mrs sites. Conversely, after
induction of the ParA protein and ParA-dependent recombination,
the unrecombined molecules will be able to come under the 
control of the integrase of phage lambda and will be able to 
recombine at the att sites. Such constructions thus make it 
possible to produce minicircle and negligible amounts of 
unrecombined molecules. The att sequences, like the mrs 
sequences, are in the direct orientation, and the int and parA 
genes may be induced simultaneously or successively from 
the same inducible promoter or from two inducible promot- 
ers. Preferably, the sequences in question are the attB and 
attP attachment sequences of phage lambda in the direct 
orientation and two mrs sequences of RK2 in the direct 
orientation. 
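The benefit of a second recombination system can be illustrated with simple arithmetic. The following is a minimal sketch only: it assumes that the Int-dependent and ParA-dependent reactions act independently, so that the fraction of plasmid left unrecombined after both is the product of the fractions left by each. That independence is an assumption made for illustration and is not stated in the specification. The 5 to 10% range for the first reaction comes from the text above; any value supplied for the second reaction is hypothetical.

```python
def unrecombined_fraction(residual_first: float, residual_second: float) -> float:
    """Fraction of starting plasmid left after two recombination steps.

    residual_first:  fraction left after the first (e.g. Int-dependent) step.
    residual_second: fraction left after the second (e.g. ParA-dependent) step.
    Assumes the two steps are independent (illustrative assumption only).
    """
    for value in (residual_first, residual_second):
        if not 0.0 <= value <= 1.0:
            raise ValueError("residual fractions must lie between 0 and 1")
    return residual_first * residual_second


if __name__ == "__main__":
    # 10% left after the first step (upper end of the range in the text),
    # and a hypothetical 10% left after the second step.
    print(unrecombined_fraction(0.10, 0.10))
```

Under these assumptions the residual plasmid falls from about a tenth of the starting material to about a hundredth, which is the sense in which two sets of sequences leave only negligible amounts of unrecombined molecules.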

As stated above, another aspect of the present invention 
lies in a method for the production of therapeutic DNA 
molecules defined above, by excision, from a plasmid or 
chromosome, by site-specific recombination. 

Another subject of the present invention hence lies in a 
method for the production of a DNA molecule (minicircle) 
as defined above, according to which a culture of host cells 
containing a recombinant DNA as defined above is brought 
into contact with the recombinase enabling site-specific 
recombination to be induced. More preferably, the culture 
and recombinase are brought into contact either by trans- 
fection or infection with a plasmid or a phage containing the 
gene for said recombinase; or by induction of the expression 
of a gene coding for said recombinase, present in the host 
cell. As mentioned below, this gene may be present in the 
host cell in integrated form in the genome, on a replicative 
plasmid or alternatively on the plasmid of the invention, in 
the non-therapeutic portion. 

To permit the production of the minicircles according to 
the invention by site-specific recombination in vivo, the 
recombinase used must be introduced into, or induced in, 
cells or the culture medium at a particular instant. For this 
purpose, different methods may be used. According to a first 
method, a host cell is used containing the recombinase gene 
in a form permitting its regulated expression. It may, in 
particular, be introduced under the control of a promoter or 
of a system of inducible promoters, or alternatively in a 
temperature-sensitive system. In particular, the gene may be 
present in a temperature-sensitive phage, latent during the 
growth phase, and induced at a suitable temperature (for 
example lysogenic phage lambda Xis- cI857). The cassette
for expression of the recombinase gene may be carried by a
plasmid, a phage or even by the plasmid of the invention, in
the non-therapeutic region. It may be integrated in the
genome of the host cell or maintained in replicative form.
According to another method, the cassette for expression of 
the gene is carried by a plasmid or a phage used to transfect 
or infect the cell culture after the growth phase. In this case, 
it is not necessary for the gene to be in a form permitting its 
regulated expression. In particular, any constitutive
promoter may be used. The cell may also be brought into
contact with the recombinase in vitro, on a plasmid
preparation, by direct incubation with the protein.

It is preferable, in the context of the present invention, to
use a host cell capable of expressing the recombinase gene
in a regulated manner. This embodiment, in which the
recombinase is supplied directly by the host cell after
induction, is especially advantageous. In effect, it suffices
simply to place the cells in culture at the desired time under
the conditions for expression of the recombinase gene
(permissive temperature for a temperature-sensitive gene,
addition of an inducer for a regulable promoter, and the like)
in order to induce the site-specific recombination in vivo and
thus the excision of the minicircle of the invention. In
addition, this excision takes place in especially high yields,
since all the cells in culture express the recombinase, which
is not necessarily the case if a transfection or an infection has
to be carried out in order to transfer the recombinase gene.

According to a first embodiment, the method of the
invention comprises the excision of the molecules of
therapeutic DNA by site-specific recombination from a plasmid.
This embodiment employs the plasmids described above
permitting, in a first stage, replication in a chosen host, and
then, in a second stage, the excision of the non-therapeutic
portions of said plasmid (in particular the origin of
replication and the resistance gene) by site-specific recombination,
generating the circular DNA molecules of the invention. To
carry out the method, different types of plasmid may be
used, and especially a vector, a phage or a virus. A
replicative vector is preferably used.

Advantageously, the method of the invention comprises a
prior step of transformation of host cells with a plasmid as
defined above, followed by culturing of the transformed
cells, enabling suitable amounts of plasmid to be obtained.
Excision by site-specific recombination is then carried out
by bringing into contact with the recombinase under the
conditions defined above (FIG. 2). As stated above, in this
embodiment, the site-specific recombination may be carried
out in vivo (that is to say in the host cell) or in vitro (that
is to say on a plasmid preparation).

According to a preferred embodiment, the DNA molecules
of the invention are hence obtained from a replicative
vector, by excision of the non-therapeutic portion carrying,
in particular, the origin of replication and the marker gene,
by site-specific recombination.
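The excision step described above can be pictured with a small string model. This is a minimal sketch under stated assumptions, not the method of the specification: the plasmid is written as a linear string standing for a circle, the two recombination sites are treated as plain markers that each occur once, and the hybrid sites that real recombination leaves on each product are not modelled. The function name and the toy labels (ORI, KAN, GENE) are hypothetical.

```python
def excise_minicircle(plasmid: str, first_site: str, second_site: str) -> tuple[str, str]:
    """Split a circular plasmid at two directly oriented recombination sites.

    The segment running from first_site up to second_site is returned as the
    minicircle; the remainder of the circle is returned as the miniplasmid.
    Both products are circular and are written here in linear notation.
    """
    if plasmid.count(first_site) != 1 or plasmid.count(second_site) != 1:
        raise ValueError("each recombination site must occur exactly once")
    start = plasmid.index(first_site)
    end = plasmid.index(second_site)
    if start >= end:
        raise ValueError("first_site must precede second_site in the notation used")
    minicircle = plasmid[start:end]
    # The rest of the circle wraps around the end of the linear notation.
    miniplasmid = plasmid[end:] + plasmid[:start]
    return minicircle, miniplasmid


if __name__ == "__main__":
    toy = "ORIKAN<attP>GENE<attB>"
    mini, rest = excise_minicircle(toy, "<attP>", "<attB>")
    print(mini)  # segment carrying the gene of interest
    print(rest)  # segment carrying the origin and the marker
```

In this toy example the gene of interest ends up on one circle, and the origin of replication and the marker gene on the other, which is the separation the method relies on.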

According to another embodiment, the method of the
invention comprises the excision of the DNA molecules
from the genome of the host cell by site-specific
recombination. This embodiment is based more especially on the
construction of cell hosts comprising, inserted into their
genome, one or more copies of a cassette comprising the
gene of interest flanked by the sequences permitting
recombination (FIG. 1). Different techniques may be used for
insertion of the cassette of the invention into the genome of
the host cell. In particular, insertion at several distinct points
of the genome may be obtained by using integrative vectors.
In this connection, different transposition systems such as, in
particular, the miniMu system or defective transposons such
as Tn10 derivatives, for example, may be used (Kleckner et
al., Methods Enzymol. 204 (1991) 139; Groisman E.,
Methods Enzymol. 204 (1991) 180). The insertion may also be
carried out by homologous recombination, enabling a
cassette containing two recombination sequences in the direct
orientation flanking one or more genes of interest to be
integrated in the genome of the bacterium. This process may,
in addition, be reproduced as many times as desired so as to
have the largest possible number of copies per cell. Another
technique also consists in using an in vivo amplification
system using recombination, as described in Labarre et al.
(Labarre J., O. Reyes, A. Guyonvarch, and G. Leblon. 1993.
Gene replacement, integration, and amplification at the
gdhA locus of Corynebacterium glutamicum. J. Bacteriol.
175:1001-1007), so as to augment from one copy of the
cassette to a much larger number.

A preferred technique consists in the use of miniMu. To
this end, miniMu derivatives are constructed comprising a
resistance marker, the functions required in cis for their
transposition and a cassette containing two recombination
sequences in the direct orientation flanking the gene or genes
of interest. These miniMus are advantageously placed at
several points of the genome using a resistance marker
(kanamycin, for example) enabling several copies per
genome to be selected (Groisman E., cited above). As
described above, the host cell in question can also express
inducibly a site-specific recombinase leading to the excision
of the fragment flanked by the recombination sequences in
the direct orientation. After excision, the minicircles may be
purified by standard techniques.

This embodiment of the method of the invention is 
especially advantageous, since it leads to the generation of 
a single type of plasmid molecule: the minicircle of the 
invention. The cells do not contain, in effect, any other 
episomal plasmid, as is the case during production from a
plasmid (FIGS. 1 and 2).

Another subject of the invention also lies in a modified 
host cell comprising, inserted into its genome, one or more 
copies of a recombinant DNA as defined above. 

The invention also relates to any recombinant cell
containing a plasmid as defined above. These cells are obtained
by any technique known to a person skilled in the art
enabling a DNA to be introduced into a given cell. Such a
technique can be, in particular, transformation,
electroporation, conjugation, protoplast fusion or any other
technique known to a person skilled in the art. As regards
transformation, different protocols have been described in
the prior art. In particular, cell transformation may be carried
out by treating whole cells in the presence of lithium acetate
and polyethylene glycol according to the technique
described by Ito et al. (J. Bacteriol. 153 (1983) 163-168), or
in the presence of ethylene glycol and dimethyl sulphoxide
according to the technique of Durrens et al. (Curr. Genet. 18
(1990) 7). An alternative protocol has also been described in
Patent Application EP 361,991. As regards electroporation,
this may be carried out according to Becker and Guarente
(in: Methods in Enzymology Vol. 194 (1991) 182).

The method according to the invention may be carried out
in any type of cell host. Such hosts can be, in particular,
bacteria or eukaryotic cells (yeasts, animal cells, plant cells),
and the like. Among bacteria, E. coli, B. subtilis,
Streptomyces, Pseudomonas (P. putida, P. aeruginosa),
Rhizobium meliloti, Agrobacterium tumefaciens,
Staphylococcus aureus, Streptomyces pristinaespiralis, Enterococcus
faecium or Clostridium, and the like, may be mentioned
more preferentially. Among bacteria, it is preferable to use
E. coli. Among yeasts, Kluyveromyces, Saccharomyces,
Pichia, Hansenula, and the like, may be mentioned. Among
mammalian animal cells, CHO, COS, NIH3T3, and the like,
cells may be mentioned.

In accordance with the host used, the plasmid according 
to the invention is adapted by a person skilled in the art to 
permit its replication. In particular, the origin of replication 
and the marker gene are chosen in accordance with the host 
cell selected.

The marker gene may be a resistance gene, in particular
for resistance to an antibiotic (ampicillin, kanamycin,
geneticin, hygromycin, and the like), or any gene endowing
the cell with a function which it no longer possesses (for
example a gene which has been deleted on the chromosome
or rendered inactive), the gene on the plasmid reestablishing
this function.

In a particular embodiment, the method of the invention 
comprises an additional step of purification of the 
minicircle. 

In this connection, the minicircle may be purified by
standard techniques of plasmid DNA purification, since it is
supercoiled like plasmid DNA. These techniques comprise,
inter alia, purification on a cesium chloride density gradient
in the presence of ethidium bromide, or alternatively the use
of anion exchange columns (Maniatis et al., 1989). In
addition, if the plasmid DNA corresponding to the
non-therapeutic portions (origin of replication and selectable
marker in particular) is considered to be present in an
excessively large amount, it is also possible, after or before
the purification, to use one or more restriction enzymes
which will digest the plasmid and not the minicircle,
enabling them to be separated by techniques that separate
supercoiled DNA from linear DNA, such as a cesium
chloride density gradient in the presence of ethidium
bromide (Maniatis et al., 1989).
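Choosing restriction enzymes that digest the non-therapeutic plasmid DNA but leave the minicircle intact amounts to a simple sequence search. The sketch below is illustrative only: the function names and the toy sequences are hypothetical, only one strand is searched (adequate for palindromic recognition sites such as those of EcoRI, BamHI and HindIII), and both molecules are treated as circular so that sites spanning the start of the notation are found.

```python
def cuts_circular(sequence: str, site: str) -> bool:
    """Return True if a recognition site occurs anywhere on a circular sequence."""
    # Append the first len(site) - 1 bases so that wrap-around sites are seen.
    wrapped = sequence + sequence[: len(site) - 1]
    return site in wrapped


def enzymes_sparing_minicircle(
    backbone: str, minicircle: str, recognition_sites: dict[str, str]
) -> list[str]:
    """List enzymes that cut the backbone at least once and never cut the minicircle."""
    selected = []
    for name, site in recognition_sites.items():
        if cuts_circular(backbone, site) and not cuts_circular(minicircle, site):
            selected.append(name)
    return sorted(selected)


if __name__ == "__main__":
    sites = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}
    backbone = "TTGAATTCAAGGATCCTT"    # toy sequence, hypothetical
    minicircle = "CCGGATCCAACCAAGG"    # toy sequence, hypothetical
    print(enzymes_sparing_minicircle(backbone, minicircle, sites))
```

With these toy sequences only the enzyme whose site is present on the backbone and absent from the minicircle is retained; an enzyme that cuts both molecules, or neither, is of no use for the separation.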

In addition, the present invention also describes an 
improved method for the purification of minicircles. This 
method enables minicircles of very great purity to be 
obtained in large yields in a single step. This improved 
method is based on the interaction between a double- 
stranded sequence present in the minicircle and a specific 
ligand. The ligand can be of various natures, and in particular
protein, chemical or nucleic acid in nature. It is preferably
a ligand of the nucleic acid type, and in particular an
oligonucleotide, optionally chemically modified, capable of
forming by hybridization a triple helix with the specific
sequence present in the DNA molecule of the invention. It
was, in effect, shown that some oligonucleotides were
capable of specifically forming triple helices with double-
stranded DNA sequences (Hélène et al., Biochim. Biophys.
Acta 1049 (1990) 99; see also FR 94/15162 incorporated in 
the present application by reference). 

In an especially advantageous variant, the DNA mol- 
ecules of the invention hence contain, in addition, a 
sequence capable of interacting specifically with a ligand 
(FIG. 3). Preferably, it is a sequence capable of forming, by 
hybridization, a triple helix with a specific oligonucleotide. 
This sequence may be positioned at any site of the DNA 
molecule of the invention, provided it does not affect the 
functionality of the gene of interest. This sequence is also 
present in the genetic constructions of the invention 
(plasmids, cassettes), in the portion containing the gene of 
interest (see, in particular, the plasmid pXL2650). 
Preferably, the specific sequence present in the DNA mol- 
ecule of the invention comprises between 5 and 30 base 
pairs. 

The oligonucleotides used for carrying out the method
according to the invention can contain the following bases:

thymidine (T), which is capable of forming triplets with
A.T doublets of double-stranded DNA (Rajagopal et
al., Biochem 28 (1989) 7859);

adenine (A), which is capable of forming triplets with A.T
doublets of double-stranded DNA;

guanine (G), which is capable of forming triplets with
G.C doublets of double-stranded DNA;

protonated cytosine (C+), which is capable of forming
triplets with G.C doublets of double-stranded DNA
(Rajagopal et al., cited above).
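The pairing rules listed above can be turned into a small lookup. The following is a minimal sketch that uses only two of the listed pairings, thymidine reading A.T doublets and protonated cytosine reading G.C doublets, and that assumes the third strand is written in the same 5' to 3' direction as the purine strand of the duplex. The function name is hypothetical, and the alternative pairings by adenine and guanine are deliberately left out.

```python
# Third-strand base chosen for each purine of the duplex:
# T reads an A.T doublet, protonated C (written C here) reads a G.C doublet.
TRIPLET_PARTNER = {"A": "T", "G": "C"}


def third_strand_for(purine_strand: str) -> str:
    """Return a pyrimidine third strand for a homopurine target strand.

    Raises ValueError if the target contains anything other than A or G,
    since this sketch covers homopurine targets only.
    """
    strand = purine_strand.upper()
    unsupported = sorted(set(strand) - set(TRIPLET_PARTNER))
    if unsupported:
        raise ValueError(f"not a homopurine sequence, found: {unsupported}")
    return "".join(TRIPLET_PARTNER[base] for base in strand)


if __name__ == "__main__":
    print(third_strand_for("GAAGAAGAA"))
```

A run of GAA repeats in the target therefore calls for a run of CTT repeats in the oligonucleotide. The cytosines must be protonated to pair, which is why such a triple helix depends on pH.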
Preferably, the oligonucleotide used comprises a
homopyrimidine sequence containing cytosines, and the specific
sequence present in the DNA molecule is a homopurine-
homopyrimidine sequence. The presence of cytosines makes
it possible to have a triple helix which is stable at acid pH
where the cytosines are protonated, and destabilized at
alkaline pH where the cytosines are neutralized.

To permit the formation of a triple helix by hybridization,
it is important for the oligonucleotide and the specific
sequence present in the DNA molecule of the invention to be
complementary. In this connection, to obtain the best yields
and best selectivity, an oligonucleotide and a specific
sequence which are fully complementary are used in the
method of the invention. Possible combinations are, in
particular, a poly(CTT) oligonucleotide and a poly(GAA)
specific sequence. By way of example, there may be
mentioned the oligonucleotide of sequence
GAGGCTTCTTCTTCTTCTTCTTCTT (SEQ ID No. 5), in which the bases
GAGG do not form a triple helix but enable the
oligonucleotide to be spaced apart from the coupling arm.

It is understood, however, that some mismatches may be
tolerated, provided they do not lead to too great a loss of
affinity. The oligonucleotide used may be natural (composed
of unmodified natural bases) or chemically modified. In
particular, the oligonucleotide may advantageously possess
some chemical modifications enabling its resistance or its
protection against nucleases, or its affinity for the specific
sequence, to be increased.

Thus, the oligonucleotide may be rendered more resistant
to nucleases by modification of the skeleton (e.g.
methylphosphonates, phosphorothioates, phosphotriester,
phosphoramidate, and the like). Another type of modification
has as its objective, more especially, to improve the
interaction and/or the affinity between the oligonucleotide
and the specific sequence. In particular, a thoroughly
advantageous modification according to the invention consists in
methylating the cytosines of the oligonucleotide. The
oligonucleotide thus methylated displays the noteworthy
property of forming a stable triple helix with the specific
sequence at neutral pH. Hence it makes it possible to work
at higher pH values than the oligonucleotides of the prior art,
that is to say at pH values where the risks of degradation of
the plasmid DNA are lower.

The length of the oligonucleotide used in the method of
the invention is at least 3 bases, and preferably between 5
and 30. An oligonucleotide of length greater than 10 bases
is advantageously used. The length may be adapted to each
individual case by a person skilled in the art in accordance
with the desired selectivity and stability of the interaction.

The oligonucleotides according to the invention may be
synthesized by any known technique. In particular, they may
be prepared by means of nucleic acid synthesizers. It is quite
obvious that any other method known to a person skilled in
the art may be used.

To carry out the method of the invention, the specific
ligand (protein, nucleic acid, and the like) may be grafted or
otherwise onto a support. Different types of supports may be
used for this purpose, such as, in particular, functionalized
chromatography supports, in bulk form or prepacked in
columns, functionalized plastic surfaces or functionalized
latex beads, magnetic or otherwise. Chromatography
supports are preferably used. By way of example, the
chromatography supports which may be used are agarose,
acrylamide or dextran, as well as their derivatives (such as
Sephadex, Sepharose, Superose, etc.), polymers such as
poly(styrenedivinylbenzene), or grafted or ungrafted silica,
for example. The chromatography columns can function in
the diffusion or perfusion mode.

To permit its covalent coupling to the support, the ligand
is generally functionalized. In the case of an oligonucleotide,
this may be modified, for example, with a terminal thiol,
amine or carboxyl group at the 5' or 3' position. In particular,
the addition of a thiol, amine or carboxyl group makes it
possible, for example, to couple the oligonucleotide to a
support carrying disulphide, maleimide, amine, carboxyl,
[illegible] functions. These couplings form by the
establishment of disulphide, [illegible] links between the
oligonucleotide and the support. Any other method known to a
person skilled in the art may be used, such as bifunctional
coupling reagents, for example.

Moreover, to improve the activity of the coupled
oligonucleotide, it may be advantageous to perform the
coupling by means of an "arm". Use of an arm makes it
possible, in effect, to bind the oligonucleotide at a chosen
distance from the support, enabling its conditions of
interaction with the DNA molecule of the invention to be
improved. The arm advantageously consists of nucleotide
bases that do not interfere with the hybridization. Thus, the
arm may comprise purine bases. By way of example, the arm
may comprise the sequence GAGG.

The DNA molecules according to the invention may be
used in any application of vaccination or of gene and cell
therapy, for the transfer of a gene to a body, a tissue or a
given cell. In particular, they may be used for a direct
administration in vivo, or for the modification of cells in
vitro or ex vivo with a view to their implantation in a patient.
In this connection, the molecules according to the invention
may be used as they are (in the form of naked DNA), or in
combination with different synthetic or natural, chemical
and/or biochemical vectors. The latter can be, in particular,
cations (calcium phosphate, DEAE-dextran, etc.) which act
by forming precipitates with DNA, which precipitates can
be "phagocytosed" by the cells. They can also be liposomes
in which the DNA molecule is incorporated and which fuse
with the plasma membrane. Synthetic gene transfer vectors
are generally lipids or cationic polymers which complex
DNA and form a particle therewith carrying positive surface
charges. These particles are capable of interacting with the
negative charges of the cell membrane and then of crossing
the latter. DOGS (Transfectam™) or DOTMA
(Lipofectin™) may be mentioned as examples of such
vectors. Chimeric proteins have also been developed: they
consist of a polycationic portion which condenses DNA,
linked to a ligand which binds to a membrane receptor and
carries the complex into the cells by endocytosis. The DNA
molecules according to the invention may also be used for
gene transfer into cells by physical transfection techniques
such as bombardment, electroporation, and the like. In
addition, prior to their therapeutic use, the molecules of the
invention may optionally be linearized, for example by
enzymatic cleavage.

In this connection, another subject of the present
invention relates to any pharmaceutical composition comprising
at least one DNA molecule as defined above. This molecule
may be naked or combined with a chemical and/or
biochemical transfection vector. The pharmaceutical
compositions according to the invention may be formulated with a
view to topical, oral, parenteral, intranasal, intravenous,
intramuscular, subcutaneous, intra-ocular, transdermal, and
the like, administration. Preferably, the DNA molecule is
used in an injectable form or by application. It may be mixed
with any pharmaceutically acceptable vehicle for an
injectable formulation, in particular for a direct injection at the site
to be treated. The compositions can be, in particular, in the
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form of isotonic sterile solutions, or of dry, in particular represses the transcription of a gene, specifically or 

lyophilized compositions which, on addition of sterilized otherwise, inducibly or otherwise, strongly or weakly. They 

water or physiological saline as appropriate, enable inject- can be, in particular, ubiquitous promoters (promoter of the 

able solutions to be made up. Diluted Tris or PBS buffers in HPRT, PGK, a-actin, tubulin, and the like, genes), promot- 

glucose or sodium chloride may be used in particular A s ers of intermediate filaments (promoter of the GFAP, 

direct injection of the nucleic acid into the affected region of desmin, vimentin, neurofilament, keratin, and the like, 

the patient is advantageous, since it enables the therapeutic genes), promoters of therapeutic genes (for example the 

effect to be concentrated in the tissues affected. The doses of promoter of the MDR, CFTR, factor VIII, ApoAl, and the 

nucleic acid used may be adapted in accordance with like.genes), tissue-specific promoters (promoter of the pyru- 

different parameters, and in particular in accordance with the lo vate kinase gene, villin gene, gene for intestinal fatty acid 

gene, the vector, the mode of administration used, the binding protein, gene for a-actin of smooth muscle, and the 

pathology in question or alternatively the desired treatment like) or alternatively promoters that respond to a stimulus 

period. (steroid hormone receptor, retinoic acid receptor, and the 

The DNA molecules of the invention may contain one or like). Similarly, the promoter sequences may be those origi- 

more genes of interest, that is to say one or more nucleic 15 nating firom the genome of a virus, such as, for example, the 

acids (cDNA, gDNA, synthetic or semi-synthetic DNA, and promoters of the adenovirus ElA and ML? genes, the CM V 

the like) whose transcription and, where appropriate, trans- early promoter or alternatively the RSV LTR promoter, and 

lation in the target cell generate products of therapeutic, the like. In addition, these promoter regions may be modi- 

vaccinal, agricultural or veterinary value. fied by the addition of activator or regulator sequences or 

Among the genes of therapeutic value, there may be 20 sequences permitting a tissue-specific or -preponderant 

mentioned, more especially, the genes coding for enzymes, expression. 

blood derivatives, hormones, lymphokines, namely Moreover, the gene of interest can also contain a signal 
interleukins, interferons, TNF, and the like (FR 92/03120), sequence directing the synthesized product into the path- 
growth factors, neurotransmitters or their precursors or ways of secretion of the target cell. This signal sequence can 
synthetic enzymes, trophic factors, namely BDNF, CNTF, 25 be the natural signal sequence of the product synthesized, 
NGF, IGF, GMF, aFGF, bFGF, NTS, NTS, and the like; but it can also be any other functional signal sequence, or an 
apolipoproteins, namely ApoAI, ApoAIV, ApoE, and the artificial signal sequence. 

like (FR 93/05125), dystrophin or a minidystrophin (FR Depending on the gene of interest, the DNA molecules of 

91/11947), tumour suppressive genes, namely p53, Rb, the invention may be used for the treatment or prevention of 

RaplA, DCC, k-rev, and the like (FR 93/04745), genes 30 a large number of pathologies, including genetic disorders 

coding for factors involved in coagulation, namely factors (dystrophy, cystic fibrosis, and the like), neurodegenerative 

VII, VIII, IX, and the like, suicide genes, namely thymidine diseases (Alzheimer's, Parkinson's, ALS, and the like), 

kinase, cytosine deaminase, and the like; or alternatively all cancers, pathologies associated with disorders of coagula- 

or part of a natural or artificial immunoglobulin (Fab, ScFv, tion or with dyshpoproteinaemias, pathologies associated 

and the like), a ligand RNA (W091/19813), and the like. 35 with viral infections (hepatitis, AIDS, and the like), or in the 

The therapeutic gene can also be an antisense gene or agricultural and veterinary fields, and the like, 

sequence whose expression in the target cell enables gene The present invention will be described more completely 

expression or the transcription of cellular mRNAs to be by means of the examples which follow, which are to be 

controlled. Such sequences can, for example, be transcribed regarded as illustrative and non-limiting, 

in the target cell into RNAs complementary to cellular 40 

mRNAs, and can thus block their translation into protein, BRIEF DESCRIPTION OF THE FIGURES 

according to the technique described in Patent EP 140,308.

The gene of interest can also be a vaccinating gene, that is to say
a gene coding for an antigenic peptide, capable of generating an
immune response in man or animals for the purpose of vaccine
production. Such antigenic peptides can be, in particular, those
specific to the Epstein-Barr virus, the HIV virus, the hepatitis B
virus (EP 185,573) or the pseudorabies virus, or alternatively
tumour-specific peptides (EP 259 212).

Generally, in the plasmids and molecules of the invention, the gene
of therapeutic, vaccinal, agricultural or veterinary value also
contains a transcription promoter region which is functional in the
target cell or body (i.e. mammals), as well as a region located at
the 3' end and which specifies a transcription termination signal and
a polyadenylation site (expression cassette). As regards the promoter
region, this can be a promoter region naturally responsible for the
expression of the gene in question when the latter is capable of
functioning in the cell or body in question. The promoter regions can
also be those of different origin (responsible for the expression of
other proteins, or even synthetic promoters). In particular, the
promoter sequences can be from eukaryotic or viral genes. For
example, they can be promoter sequences originating from the genome
of the target cell. Among eukaryotic promoters, it is possible to use
any promoter or derived sequence that stimulates or

FIG. 1: Production of a minicircle from a cassette integrated in the
genome.

FIG. 2: Production of a minicircle from a plasmid.

FIG. 3: Production of a minicircle containing a sequence specific to
a ligand.

FIG. 4: Construction of pXL2649. Ori: Origin of replication;
[illegible]: Marker gene conferring resistance to kanamycin;
[illegible]: Marker gene conferring resistance to ampicillin;
[illegible]: Galactosidase gene of E. coli; Plac: Promoter of the
lactose operon.

FIG. 5: Luciferase activity obtained after transfection of NIH3T3
mouse fibroblasts with plasmid pXL2650, the minicircle generated from
plasmid pXL2650 and pGL2-Control (Promega Biotech). The transfection
was carried out under the following conditions: 0.5 µg of DNA per
well, 50,000 cells per well. The lipofectant used is RPR 115335. The
result is recorded in RLU per microgram of proteins as a function of
the lipofectant/DNA charge ratio.

FIG. 6: Construction of the plasmid pXL2793. This plasmid generates,
after recombination, a minicircle containing a synthetic
homopurine-homopyrimidine sequence and the luciferase cassette of
pXL2727.

FIG. 7: Well 1 corresponds to the SalI digestion of the fraction
eluted after purification with a triple-helix column. Well 2
corresponds to the XmnI digestion of the fraction eluted after
purification with a triple-helix column. Well 3
corresponds to the undigested fraction eluted after purification
with a triple-helix column. Well 4 corresponds to uninduced,
undigested plasmid pXL2793. Wells 5 and 6 correspond, respectively,
to the linear DNA and supercoiled DNA size markers.

FIG. 8: Diagrammatic description of the construction of the plasmid
pXL2776.

FIG. 9: Diagrammatic description of the constructions of the plasmids
pXL2777 and pXL2960.

FIG. 10: Action of the integrase of bacteriophage λ in E. coli on
plasmids pXL2777 and pXL2960. M: linear DNA or supercoiled DNA 1 kb
molecular weight marker. N.I.: not induced. I: induced. N.D.: not
digested.

FIG. 11: Kinetics of recombination of the integrase of bacteriophage
λ in E. coli on plasmids pXL2777 and pXL2960. 2': 2 minutes. O/N: 14
hours. M: linear DNA or supercoiled DNA 1 kb molecular weight marker.
N.I.: not induced. I: induced. N.D.: not digested.

General techniques of cloning and molecular biology.

The standard methods of molecular biology, such as centrifugation of
plasmid DNA in a cesium chloride-ethidium bromide gradient, digestion
with restriction enzymes, gel electrophoresis, electroelution of DNA
fragments from agarose gels, transformation in E. coli, precipitation
of nucleic acids, and the like, are described in the literature
(Maniatis et al., 1989, Ausubel et al., 1987). Nucleotide sequences
were determined by the chain termination method according to the
protocol already put forward (Ausubel et al., 1987).

Restriction enzymes were supplied by New England Biolabs (Biolabs),
Bethesda Research Laboratories (BRL) or Amersham Ltd. (Amersham).

To carry out ligation, DNA fragments are separated according to their
size on 0.7% agarose or 8% acrylamide gels, purified by
electrophoresis and then electroelution, extracted with phenol,
precipitated with ethanol and then incubated in a buffer comprising
50 mM Tris-HCl, pH 7.4, 10 mM MgCl2, 10 mM DTT, 2 mM ATP in the
presence of phage T4 DNA ligase (Biolabs). Oligonucleotides are
synthesized using phosphoramidite chemistry with the latter
derivatives protected at the β position by a cyanoethyl group (Sinha
et al., 1984, Giles 1985), with the Biosearch 8600 automatic DNA
synthesizer, using the manufacturer's recommendations.

The ligated DNAs are used to transform the following strains rendered
competent: E. coli MC1060 [(lacIOPZYA)X74, galU, galK, strAr, hsdR]
(Casadaban et al., 1983); HB101 [hsdS20, supE44, recA13, ara-14,
proA2, lacY1, galK2, rpsL20, xyl-5, mtl-1, F-] (Maniatis et al.,
1989); and DH5α [endA1 hsdR17 supE44 thi-1 recA1 gyrA96 relA1 λ- φ80
dlacZΔM15] for the plasmids.

LB and 2XTY culture media are used for the bacteriological part
(Maniatis et al., 1989).

Plasmid DNAs are purified according to the alkaline lysis technique
(Maniatis et al., 1989).

Definition of the terms employed and abbreviations.
Recombinant DNA: set of techniques which make it possible either to
combine, within the same microorganism, DNA sequences which are not
naturally combined, or to mutagenize a DNA fragment specifically.
ATP: adenosine 5'-triphosphate
BSA: bovine serum albumin
PBS: 10 mM phosphate buffer, 150 mM NaCl, pH 7.4
dNTP: 2'-deoxyribonucleoside 5'-triphosphates
DTT: dithiothreitol
bp: base pairs

EXAMPLE 1

Construction of a Plasmid Carrying the attP and attB Sequences of
Bacteriophage λ, in Repeated Direct Orientations.

The plasmid pNH16a was used as starting material, inasmuch as it
already contains a fragment of bacteriophage λ carrying the attP
sequence (Hasan and Szybalski, 1987). This plasmid was digested with
EcoRI. Oligonucleotides which contain the attB sequence (Landy, 1989)
were synthesized. They have the following sequence:

Oligonucleotide 5476 (SEQ ID No. 1)
5'-AATTGTGAAGCCTGCTTTTTTATACTAACTTGAGCGG-3'
Oligonucleotide 5477 (SEQ ID No. 2)
5'-AATTCCGCTCAAGTTAGTATAAAAAAGCAGGCTTCAC-3'

They were hybridized to re-form the attB sequence and then ligated at
the EcoRI site of the 4.2-kb EcoRI fragment of pNH16a (Hasan and
Szybalski, 1987). After transformation of DH5α, a recombinant clone
was retained. The plasmid thereby constructed was designated pXL2648
(see FIG. 4). This plasmid contains the attP and attB sequences of
bacteriophage λ in the direct orientation. Under the action of the
integrase of the bacteriophage (Int protein), there should be
excision of the sequences lying between the two att sites. This
results in separation of the material inserted between the two att
sequences from the origin of replication and from the resistance
marker of the plasmid, which are positioned on the outside.
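
The pairing of oligonucleotides 5476 and 5477 can be checked with a
short script. The following Python sketch is illustrative only and is
not part of the original disclosure: the function names are
hypothetical, and the sequences are those given as SEQ ID No. 1 and
No. 2. It tests that each oligonucleotide begins with a 5'-AATT
extension compatible with an EcoRI-cut end and that the remaining
bases are reverse complements of one another.

```python
# Illustrative check only; helper names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Sequences as printed for SEQ ID No. 1 and SEQ ID No. 2.
OLIGO_5476 = "AATTGTGAAGCCTGCTTTTTTATACTAAC" "TTGAGCGG"
OLIGO_5477 = "AATTCCGCTCAAGTTAGTATAAAAAAGCA" "GGCTTCAC"
ECORI_OVERHANG = "AATT"  # 5' single-stranded extension left by EcoRI


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def anneals_with_overhangs(top: str, bottom: str, overhang: str) -> bool:
    """True if both oligos start with `overhang` and their remaining
    bases are exact reverse complements of each other."""
    if not (top.startswith(overhang) and bottom.startswith(overhang)):
        return False
    n = len(overhang)
    return top[n:] == reverse_complement(bottom[n:])


def regenerates_ecori(oligo: str) -> bool:
    """Ligation to an EcoRI-cut end places a G before the AATT overhang;
    GAATTC is restored only if the next base of the oligo is C."""
    return ("G" + oligo).startswith("GAATTC")


if __name__ == "__main__":
    print(anneals_with_overhangs(OLIGO_5476, OLIGO_5477, ECORI_OVERHANG))
    print(regenerates_ecori(OLIGO_5476), regenerates_ecori(OLIGO_5477))
```

Under these assumptions only the 5477 end would restore a GAATTC site
after ligation. This reading is an inference from the printed
sequences, not a statement of the source.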

EXAMPLE 2

Obtaining a Minicircle in vivo in E. coli.

A cassette for resistance to kanamycin was cloned at the EcoRI site
of plasmid pXL2648 (FIG. 4). This cassette originates from the
plasmid pUC4KIXX (Pharmacia Biotech.). For this purpose, 10 µg of
plasmid pUC4KIXX were digested with EcoRI and then separated by
agarose gel electrophoresis; the 1.6-kb fragment containing the
kanamycin resistance marker was purified by electroelution; it was
then ligated to plasmid pXL2648 linearized with EcoRI. The
recombinant clones were selected after transformation into E. coli
DH5α and selection for resistance to kanamycin. The expected
restriction profile was observed on one clone; this plasmid clone was
designated pXL2649 (FIG. 4). This plasmid was introduced by
transformation into two E. coli strains:

D1210 [hsdS20, supE44, recA13, ara-14, proA2, lacY1, galK2, rpsL20,
xyl-5, mtl-1, λ-, F-, lacIq] (Sadler et al., 1980).

D1210HP, which corresponds to D1210 lysogenized with the phage λ
xis- (Xis- Kil-) cI857 (Podjaska et al., 1985). The D1210HP strain
[supE44 ara-14 galK2 Δ(gpt-proA)62 rpsL20 xyl5 mtl1 recA13
Δ(mcrC-mrr) hsdS lacIq] (λ[cI857 xis- kil-]), accession number
I-2314, was deposited on Sep. 15, 1999 with the Collection Nationale
de Cultures de Microorganismes (CNCM), Institut Pasteur, 25 rue du
Docteur Roux, F-75724 Paris Cedex 15, FRANCE.

The transformants were selected at 30° C. on 2XTY medium with
kanamycin (50 mg/l). After reisolation on selective medium, the
strains were inoculated into 5 ml of L medium supplemented with
kanamycin (50 mg/l). After 16 h of incubation at 30° C. with
agitation (5 cm of rotational amplitude), the cultures were diluted
to 1/100 in 100 ml of the same medium. These cultures were incubated
under the same conditions until an OD610 of 0.3 was reached. At this
point, half of the culture was removed and then incubated for 10 min
at 42° C. to induce the lytic cycle of the phage, hence the
expression of the integrase. After this incubation, the cultures were
transferred again to 30° C. and then incubated for 1 h under these
conditions. Next, culturing was stopped and minipreparations of
plasmid DNA were produced. Irrespective of the conditions, in the
strain D1210, the agarose gel electrophoresis profile of the
undigested plasmid DNA of plasmid pXL2649 is unchanged, as is also
the case in the strain D1210HP which has not been thermally induced.
On the contrary, in D1210HP which has been incubated for 10 min at
42° C. and then cultured for 1 hour at 30° C., it is found that there
is no longer a plasmid, but two circular DNA molecules: one of low
molecular weight, migrating faster and containing an EcoRI site; and
one of higher molecular weight, containing a unique BglI site, as
expected. Hence there has indeed been excision of the sequences
present between the two att sequences, and generation of a minicircle
bereft of any origin of replication. This supercoiled circular DNA
not carrying an origin of replication is termed a minicircle. This
name takes, in effect, better account of the circular nature of the
molecule. The starting plasmid pXL2649 is present, but it represents
approximately 10% of the plasmid which has excised the sequences
flanked by att.

The minicircle may then be purified by standard techniques of plasmid
DNA purification, since it is supercoiled like plasmid DNA. These
techniques comprise, inter alia, purification on a cesium chloride
density gradient in the presence of ethidium bromide, or
alternatively the use of anion exchange columns (Maniatis et al.,
1989). In addition, if the plasmid DNA corresponding to the origin of
replication and to the selectable marker is considered to be present
in an excessively large amount, it is always possible, after
purification, to use one or more restriction enzymes which will
digest the plasmid and not the minicircle, enabling them to be
separated by techniques that separate supercoiled DNA from linear
DNA, such as in a cesium chloride density gradient in the presence of
ethidium bromide (Maniatis et al., 1989).

EXAMPLE 3

Obtaining a Minicircle Containing a Cassette for the Expression of
Luciferase.

In order to test the use of these minicircles in vivo, a reporter
gene with the sequences required for its expression was cloned into
plasmid pXL2649 (see Example 2). This was done using, more
especially, a 3150-bp BglII-BamHI cassette originating from
pGL2-Control (Promega Biotech). This cassette contains the SV40 early
promoter, the enhancer of the SV40 early promoter, the luciferase
gene of Photinus pyralis and a polyadenylation site derived from
SV40. The 3150-bp BglII-BamHI fragment was cloned at the BamHI site
of pXL2649 digested with BamHI so as to replace the cassette for
resistance to kanamycin by the cassette for the expression of
luciferase from pGL2-Control. The plasmid thus constructed was called
pXL2650. In this plasmid, the attP and attB sites flank the cassette
for the expression of luciferase. Site-specific recombination enables
only the sequences required for the expression of luciferase together
with the luciferase gene to be excised. This recombination may be
carried out exactly as described in Example 2. A minicircle such as
that generated from plasmid pXL2650 may be used thereafter in in vivo
or in vitro transfection experiments.

A 1-liter culture of the strain D1210HP pXL2650 in 2XTY medium
supplemented with ampicillin (50 µg/ml) was set up at 30° C. At an
OD610 equal to 0.3, the culture was transferred to 42° C. for 20 min,
then returned to 30° C. for 20 min. The episomal DNA was prepared by
the clear lysate technique (Maniatis et al., 1989), followed by a
cesium chloride density gradient supplemented with ethidium bromide
(Maniatis et al., 1989), then by an extraction of the ethidium
bromide with isopropanol and by a dialysis. This [illegible]
minicircle. 100 µg of this preparation were digested with PstI, and
the hydrolysate was then subjected to a cesium chloride density
gradient supplemented with ethidium bromide (Maniatis et al., 1989).
An identical result is obtained when the preparation is digested
jointly with AlwNI and XmnI. The supercoiled form was recovered and,
after removal of the ethidium bromide (Maniatis et al.), it was found
to correspond only to the minicircle, lacking an origin of
replication and any marker gene. This minicircle preparation may be
used for in vitro and in vivo transfection experiments.

EXAMPLE 4

In vitro Transfection of Mammalian Cells, and More Especially of
Human Cells, with a Minicircle.

The minicircle DNA containing the luciferase gene of Photinus pyralis
as described in Example 3, that is to say corresponding to the
minicircle generated from plasmid pXL2650, is diluted in 150 mM NaCl
and mixed with a transfectant. It is possible to use various
commercial transfectants, such as dioctadecylamidoglycylspermine
(DOGS, Transfectam™, Promega), Lipofectin™ (Gibco-BRL), and the like,
in different positive/negative charge ratios. By way of illustration,
the transfecting agent was used in charge ratios greater than or
equal to 3. The mixture is vortexed, left for 10 minutes at room
temperature, diluted in culture medium without fetal calf serum, and
then added to the cells in the proportion of 2 µg of DNA per culture
well. The cells used are Caco-2, derived from a human colon
adenocarcinoma, cultured according to a protocol described (Wils et
al., 1994) and inoculated on the day before the experiment into
48-well culture plates in the proportion of 50,000 cells/well. After
two hours at 37° C., 10% v/v of fetal calf serum is added and the
cells are incubated for 24 hours at 37° C. in the presence of 5% CO2.
The cells are washed twice with PBS and the luciferase activity is
measured according to the protocol described (such as the Promega
kit). [illegible] (fibroblasts, lymphocytes, etc.) originating from
different species, or alternatively cells taken from an individual
(fibroblasts, keratinocytes, lymphocytes, etc.) and which will be
rein[illegible] after transfection.

EXAMPLE 5

In vitro Transfection of NIH 3T3 Cells.

The minicircle DNA containing the luciferase gene of Photinus
pyralis, as described in Example 3, that is to say corresponding to
the minicircle generated from plasmid pXL2650, was transfected in
vitro into mammalian cells; pXL2650 and pGL2-Control (Promega
Biotech.), which contain the same expression cassette, were used as
control. The cells used are NIH 3T3 mouse fibroblasts, inoculated on
the day before the experiment into 24-well culture plates in the
proportion of 50,000 cells per well. The plasmid is diluted in 150 mM
NaCl and mixed with the lipofectant RPR115335. However, it is
possible to use various other commercial agents such as
dioctadecylamidoglycylspermine (DOGS, Transfectam™, Promega)
(Demeneix et al., Int. J. Dev. Biol. 35 (1991) 481), Lipofectin™
(Gibco-BRL) (Felgner et al., Proc. Natl. Acad. Sci. USA 84 (1987)
7413), and the like. A positive charge of the lipofectant/negative
charge of the DNA ratio equal to or greater than 3 is used. The
mixture is vortexed, left for ten minutes at room temperature,
diluted in medium without fetal calf serum, and then added to the
cells in the proportion of 0.5 µg of DNA per culture well. After two
hours at 37° C., 10% by volume of fetal calf serum is added and the
cells are incubated for 48 hours at 37° C. in the presence of 5% CO2.
The cells are washed twice with PBS and the luciferase activity is
measured according to the protocol described (Promega kit, Promega
Corp. Madison, Wis.), on a Lumat LB9501 luminometer (EG and G
Berthold, Evry). The transfection results corresponding to the
conditions which have just been stated are presented in FIG. 5. They
show unambiguously that the minicircle has the same transfection
properties as plasmids possessing an origin of replication. Thus
these minicircles could be used in the same way as standard plasmids
in gene therapy applications.

EXAMPLE 6

Affinity Purification of a Minicircle Using a Triple-helix
Interaction.

This example describes a method of purification of a minicircle
according to the invention from a mixture containing the plasmid form
which has excised it, by triple-helix type interactions which will
take place with a synthetic DNA sequence carried by the minicircle to
be purified. This example demonstrates how the technology of
purification by triple-helix formation may be used to separate a
minicircle from a plasmid form which has excised it.

6-1. Obtaining a Minicircle Containing a Synthetic
Homopurine-homopyrimidine Sequence

6-1.1. Insertion of a homopurine-homopyrimidine sequence into plasmid
pXL2650

Plasmid pXL2650 possesses a unique BamHI site immediately after the
cassette containing the luciferase gene of Photinus pyralis. This
unique site was used to clone the following two oligonucleotides:

4957 (SEQ ID No. 3)
5'-GATCCGAAGAAGAAGAAGAAGAAGAAG
AAGAAGAAGAAGAAGAAGAAGAAGAAGAAC-3'
4958 (SEQ ID No. 4)
5'-GATCGTTCTTCTTCTTCTTCTTCTTCTTCT
TCTTCTTCTTCTTCTTCTTCTTCTTCG-3'

These oligonucleotides, when hybridized and cloned into plasmid
pXL2650, introduce a homopurine-homopyrimidine sequence (GAA)17, as
described above.

To carry out this cloning, the oligonucleotides were first hybridized
in the following manner. One µg of each of these two oligonucleotides
were placed together in 40 µl of a final buffer comprising 50 mM
Tris-HCl, pH 7.4, 10 mM MgCl2. This mixture was heated to 95° C. and
was then placed at room temperature so that the temperature would
fall slowly. Ten ng of the mixture of hybridized oligonucleotides
were ligated with 200 ng of plasmid pXL2650 linearized with BamHI, in
a final volume of 30 µl. After ligation, an aliquot was transformed
into DH5α. The transformation mixtures were plated out on L medium
supplemented with ampicillin (50 mg/l). Twenty-four clones were
digested with PflMI and BamHI. One clone was found which had the size
of the 950-bp PflMI-BamHI fragment increased by 50 bp. This clone was
selected and designated pXL2651.

Plasmid pXL2651 was purified according to the Wizard Megaprep kit
(Promega Corp., Madison, Wis.) according to the supplier's
recommendations.

6-1.2. Insertion of a homopurine-homopyrimidine sequence into plasmid
pXL2649

a) Insertion of new restriction sites on each side of the kanamycin
cassette of pXL2649.

Plasmid pXL2649, as described in Example 2, was digested with EcoRI
so as to take out the kanamycin cassette originating from plasmid
pUC4KIXX (Pharmacia Biotech, Uppsala, Sweden). For this purpose, 5 µg
of plasmid pXL2649 were digested with EcoRI. The 4.2-kb fragment was
separated by agarose gel electrophoresis and purified by
electroelution.

In addition, the plasmid pXL1571 was used. The latter was constructed
from the plasmid pFR10 (Gene 25 (1983), 71-88), into which the 1.6-kb
fragment originating from pUC4KIXX, corresponding to the kanamycin
gene, was inserted at the SstI site. This cloning enabled 12 new
restriction sites to be inserted on each side of the kanamycin gene.

Five micrograms of pXL1571 were digested with EcoRI. The 1.6-kb
fragment corresponding to the kanamycin gene was separated by agarose
gel electrophoresis and purified by electroelution. It was then
ligated with the 4.2-kb EcoRI fragment of pXL2649. The recombinant
clones were selected after transformation into E. coli DH5α and
selection for resistance to kanamycin and to ampicillin. The expected
restriction profile was observed on one clone; this plasmid clone was
designated pXL2791.

b) Extraction of the kanamycin cassette from plasmid pXL2791

Plasmid pXL2791 was digested with SstI so as to take out the
kanamycin cassette. The 4.2-kb fragment was separated by agarose gel
electrophoresis and purified with the Jetsorb extraction gel kit
(Genomed). It was then ligated. The recombinant clones were selected
for resistance to ampicillin after transformation into E. coli DH5α.
The expected restriction profile was observed on one clone. This
plasmid clone was designated pXL2792. This clone comprises, inter
alia, SalI and XmaI restriction sites between the attP and attB
sites.

c) Cloning of a homopurine-homopyrimidine sequence as well as of a
cassette permitting the expression of luciferase between the two attP
and attB sites of plasmid pXL2792

Plasmid pXL2727 was used. This plasmid, digested with XmaI and SalI,
enables a fragment comprising the following to be taken out: the pCMV
promoter, the luciferase gene of Photinus pyralis, a polyadenylation
site derived from SV40 and a homopurine-homopyrimidine sequence. The
latter was obtained after hybridization and cloning of the following
two oligonucleotides:

6006: (SEQ ID No. 16)
5'-GATCTGAAGAAGAAGAAGAAGAAGAAGA
AGAAGAAGAAGAAGAAGAAGAAGAAGAAG
TGCAGATCT-3'
6008: (SEQ ID No. 17)
5'-GATCAGATCTGCAGTTCTTCTTCTTCTTCTT
CTTCTTCTTCTTCTTCT
TCTTCTTCTTCTTCTTCA-3'

The homopurine-homopyrimidine sequence present in pXL2727 was
sequenced by the Sequenase Version 2.0 method (United States
Biochemical Corporation). The result obtained shows that the
homopurine-homopyrimidine sequence actually present in plasmid
pXL2727 contains 10 repeats (GAA-CTT), and not 17 as the sequence of
the oligonucleotides 6006 and 6008 suggested would be the case. The
sequence actually present in plasmid pXL2727, read after sequencing
on the strand corresponding to the oligonucleotide 6008, is as
follows:






5'-GATCAGATCTGCAGTCTCTTCTTCTTCTT
CTTCTTCTTCTTCTTCTTCTCTTCTCA-3' (SEQ ID No. 18)
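
The difference between the designed repeat tract and the tract
actually found in pXL2727 can be illustrated by counting tandem
repeats. The sketch below is a hypothetical helper and is not part of
the original disclosure; it counts consecutive TTC units on the
strand of oligonucleotide 6008, using the sequences as printed for
SEQ ID No. 17 and SEQ ID No. 18.

```python
import re

# Sequences as printed for SEQ ID No. 17 (oligonucleotide 6008) and
# SEQ ID No. 18 (sequence read from pXL2727), line by line.
OLIGO_6008 = (
    "GATCAGATCTGCAGTTCTTCTTCTTCTTCTT"
    "CTTCTTCTTCTTCTTCT"
    "TCTTCTTCTTCTTCTTCA"
)
SEQ_ID_18 = (
    "GATCAGATCTGCAGTCTCTTCTTCTTCTT"
    "CTTCTTCTTCTTCT"
    "TCTTCTCTTCTCA"
)


def longest_tandem_run(seq: str, unit: str) -> int:
    """Largest number of back-to-back copies of `unit` found in `seq`.

    Only the reading frame given by `unit` itself is considered.
    """
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


if __name__ == "__main__":
    print("designed :", longest_tandem_run(OLIGO_6008, "TTC"))
    print("sequenced:", longest_tandem_run(SEQ_ID_18, "TTC"))
```

With the sequences as transcribed here, the count for the designed
oligonucleotide agrees with the 17 repeats expected and the count for
the sequenced insert agrees with the 10 repeats reported.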

One microgram of pXL2727 was digested with XmaI and SalI. The 3.7-kb
fragment was separated by agarose gel electrophoresis and purified
with the Jetsorb extraction gel kit (Genomed). In addition, 1.7 µg of
pXL2792 were digested with XmaI and SalI. The 4.2-kb fragment was
separated on agarose gel, purified with the Jetsorb extraction gel
kit (Genomed) and ligated with the 3.7-kb XmaI-SalI fragment of
pXL2727. The recombinant clones were selected after transformation
into E. coli DH5α and selection for resistance to ampicillin. The
expected restriction profile was observed on one clone; this clone
was designated pXL2793. Plasmid pXL2793 was purified using a cesium
chloride density gradient according to a method already described
(Maniatis et al., 1989).

6-2. Preparation of the Column Enabling Triple-helix Type
Interactions with a Homopurine-homopyrimidine Sequence Present in the
Minicircle to be Effected

The column was prepared in the following manner:
The column used is a 1-ml HiTrap column activated with NHS
(N-hydroxysuccinimide, Pharmacia), connected to a peristaltic pump
(flow rate <1 ml/min). The specific oligonucleotide used possesses an
NH2 group at the 5' end.
For plasmid pXL2651, its sequence is as follows:
5'-GAGGCTTCTTCTTCTTCTTCTTCTT-3' (SEQ ID No. 5)

For plasmid pXL2793, its sequence is as follows (oligo 116418):
[sequence missing]

The buffers used are the following:

Coupling buffer: 0.2 M NaHCO3, 0.5 M NaCl, pH 8.3.

Washing buffer:

Buffer A: 0.5 M ethanolamine, 0.5 M NaCl, pH 8.3.

Buffer B: 0.1 M acetate, 0.5 M NaCl, pH 4.

Fixing and eluting buffer:

Buffer F: 2 M NaCl, 0.2 M acetate, pH 4.5.

Buffer E: 1 M Tris-HCl, pH 9, 0.5 mM EDTA.

The column is prepared in the following manner:

The column is washed with 6 ml of 1 mM HCl, and the oligonucleotide
diluted in the coupling buffer (50 nmol in 1 ml) is then applied to
the column and left for 30 minutes at room temperature. The column is
washed with 3 ml of coupling buffer, then with 6 ml of buffer A,
followed by 6 ml of buffer B. The latter two buffers are applied
three times in succession to the column. In this way, the
oligonucleotide is linked covalently to the column via a CONH link.
The column is stored at 4° C. in PBS, 0.1% NaN3.
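
The recognition between the column oligonucleotide for pXL2651 and
the (GAA)17 tract can be sketched as follows. This is an illustrative
sketch only, assuming the usual pyrimidine-motif triplex rules (third
strand parallel to the purine strand, T opposite A and protonated C
opposite G); these rules, the helper names and the treatment of the
5'-GAGG bases as a non-binding extension are assumptions, not
statements of the source.

```python
# Pyrimidine-motif triplex pairing (assumed): T recognizes an A-T pair,
# protonated C recognizes a G-C pair, third strand parallel to purines.
THIRD_STRAND_BASE = {"A": "T", "G": "C"}

COLUMN_OLIGO = "GAGGCTTCTTCTTCTTCTTCTTCTT"  # SEQ ID No. 5 as printed
PURINE_TRACT = "GAA" * 17                   # (GAA)17 tract of pXL2651


def third_strand_for(purine_tract: str) -> str:
    """Parallel pyrimidine third strand predicted for a purine tract.

    Raises KeyError if the tract contains anything other than A or G.
    """
    return "".join(THIRD_STRAND_BASE[base] for base in purine_tract)


def binding_window(oligo: str, purine_tract: str) -> str:
    """Longest 3'-terminal part of `oligo` found in the predicted third
    strand; an empty string means no part of the oligo matches."""
    predicted = third_strand_for(purine_tract)
    for start in range(len(oligo)):
        if oligo[start:] in predicted:
            return oligo[start:]
    return ""


if __name__ == "__main__":
    window = binding_window(COLUMN_OLIGO, PURINE_TRACT)
    print(window, len(window))
```

On this reading the 21 bases following 5'-GAGG, that is seven CTT
units, are the part expected to engage the tract. Cytosine
protonation is favoured at acidic pH, which would be consistent with
the pH 4.5 of buffer F used for fixing.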
6-3. Purification of a Minicircle Containing a Synthetic
Homopurine-homopyrimidine Sequence, by a Triple-helix Type
Interaction

6-3.1. Purification of plasmid pXL2651

Plasmid pXL2651 was introduced into the strain D1210HP. This
recombinant strain [D1210HP (pXL2651)] was cultured as described in
Example 3 so as to generate the minicircle containing the luciferase
gene of Photinus pyralis. Twenty ml of culture were removed and
centrifuged. The cell pellet is taken up in 1.5 ml of 50 mM glucose,
25 mM Tris-HCl, pH 8, 10 mM EDTA. Lysis is carried out with 2 ml of
0.2 M NaOH, 1% SDS, and neutralization with 1.5 ml of 3 M potassium
acetate, pH 5. The DNA is then precipitated with 3 ml of 2-propanol,
and the pellet is taken up in 0.5 ml of 0.2 M sodium acetate, pH 5,
0.1 M NaCl and loaded onto an oligonucleotide column capable of
forming triple-helix type interactions with poly(GAA) sequences
contained in the minicircle, as described above. After the column has
been washed beforehand with 6 ml of buffer F, the solution containing
the minicircle to be purified is incubated, after being applied to
the column, for two hours at room temperature. The column is washed
with 10 ml of buffer F and elution is then carried out with buffer E.

Purified DNA corresponding to the minicircle is thereby obtained. The
minicircle obtained, analysed by agarose gel electrophoresis and
ethidium bromide staining, takes the form of a single band of
supercoiled circular DNA. Less than 5% of starting plasmid pXL2651 is
left in the preparation.

6-3.2. Purification of plasmid pXL2793

The 7.9-kb plasmid pXL2793 was introduced into the strain D1210HP.
This recombinant strain was cultured as described in Example 3, so as
to generate the 4-kb minicircle containing the luciferase gene of
Photinus pyralis and a 3.9-kb plasmid. Two hundred ml of culture were
removed and centrifuged. The cell pellet was treated with the Wizard
Megaprep kit (Promega Corp., Madison, Wis.) according to the
supplier's recommendations. The DNA was taken up in a final volume of
2 ml of 1 mM Tris, 1 mM EDTA, pH 8. Two hundred and fifty microliters
of this plasmid sample were diluted with buffer F in a final volume
of 2.5 ml. The column was washed beforehand with 6 ml of buffer F.
The whole of the diluted sample was loaded onto an oligonucleotide
column capable of forming triple-helix type interactions with
poly(GAA) sequences contained in the minicircle, prepared as
described above. After washing with 10 ml of buffer F, elution is
carried out with buffer E. The eluted sample is recovered in 1-ml
fractions.

By this method, purified DNA corresponding to the minicircle
generated from pXL2793 is obtained. The DNA sample eluted from the
column was analysed by agarose gel electrophoresis and ethidium
bromide staining, and by enzyme restriction. For this purpose, the
eluted fractions which were shown to contain DNA by assay at OD260 nm
were dialysed for 24 hours against 1 mM Tris, 1 mM EDTA, then
precipitated with isopropanol and taken up in 200 µl of H2O. Fifteen
microliters of the sample thereby obtained were digested with SalI,
this restriction site being present in the minicircle and not in the
3.9-kb plasmid generated by the recombination, or with XmnI, this
restriction site being present in the 3.9-kb plasmid generated by the
recombination and not in the minicircle. The result obtained is
presented in FIG. 7, showing that the minicircle has been purified of
the recombinant plasmid.
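
The logic of the diagnostic digestions in section 6-3.2 can be
summarized in a short model. The following Python sketch is
illustrative only: the class and function names are hypothetical, the
sizes are those quoted above (7.9-kb pXL2793, 4-kb minicircle), and
it is assumed for simplicity that each enzyme cuts its target
molecule once, which the source does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Circle:
    name: str
    size_kb: float
    cutters: frozenset  # enzymes assumed to cut this molecule once


def excise(parent_kb: float, minicircle_kb: float) -> tuple:
    """Sizes (minicircle, miniplasmid) after att-site recombination.

    Recombination conserves total length, so the miniplasmid is
    whatever remains of the parent plasmid.
    """
    return minicircle_kb, round(parent_kb - minicircle_kb, 3)


def digest(circles, enzyme: str) -> dict:
    """Expected form of each molecule after digestion with `enzyme`."""
    return {
        c.name: "linear" if enzyme in c.cutters else "supercoiled"
        for c in circles
    }


# Values quoted in section 6-3.2.
MINI_KB, PLASMID_KB = excise(7.9, 4.0)
PRODUCTS = [
    Circle("minicircle", MINI_KB, frozenset({"SalI"})),
    Circle("miniplasmid", PLASMID_KB, frozenset({"XmnI"})),
]

if __name__ == "__main__":
    for enzyme in ("SalI", "XmnI"):
        print(enzyme, digest(PRODUCTS, enzyme))
```

In such a model, a purified minicircle fraction would be expected to
be linearized by SalI and left supercoiled by XmnI, which is the
comparison made between wells 1 and 2 of FIG. 7.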

EXAMPLE 7

In vivo Transfection of Mammalian Cells with a Minicircle

This example describes the transfer of a minicircle carrying the
luciferase gene into the brain of newborn mice. The minicircle
(30 µg) is diluted in sterile 150 mM NaCl to a concentration of
1 µg/µl. A synthetic transfectant such as
dioctadecylamidoglycylspermine (DOGS) is then added in a
positive/negative charge ratio less than or equal to 2. The mixture
is vortexed, and 2 µg of DNA are injected into the cerebral cortex of
anaesthetized newborn mice using a micromanipulator and a
microsyringe. The brains are removed 48 hours later, homogenized and
centrifuged and the supernatant is used for the assay of luciferase
by the protocols described (such as the Promega kit).
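
The positive/negative charge ratio used in this and the preceding
examples can be converted into an amount of transfectant as sketched
below. This is a rough, illustrative calculation and not part of the
original disclosure: the average nucleotide mass of 330 g/mol and the
number of positive charges carried by each transfectant molecule are
assumptions that must be replaced by values appropriate to the
reagent actually used.

```python
# Assumption: one negative charge per nucleotide and an average
# nucleotide mass of about 330 g/mol (a common approximation).
AVG_NUCLEOTIDE_MASS = 330.0  # g/mol


def phosphate_nmol(dna_ug: float) -> float:
    """Approximate nmol of negative charges carried by `dna_ug` of DNA."""
    return dna_ug * 1e-6 / AVG_NUCLEOTIDE_MASS * 1e9


def transfectant_nmol(dna_ug: float, charge_ratio: float,
                      charges_per_molecule: float) -> float:
    """nmol of transfectant giving the requested +/- charge ratio.

    `charges_per_molecule` is a hypothetical parameter: the number of
    positive charges assumed per transfectant molecule.
    """
    return charge_ratio * phosphate_nmol(dna_ug) / charges_per_molecule


if __name__ == "__main__":
    # 30 µg of minicircle, charge ratio 2, and an assumed 3 positive
    # charges per transfectant molecule (illustrative value only).
    print(round(transfectant_nmol(30, 2, 3), 1), "nmol of transfectant")
```

The same function applies to the in vitro conditions of Examples 4
and 5 by passing the DNA amount per well and a charge ratio of 3 or
more.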

EXAMPLE 8 

Use of the par Locus of RK2 to Reduce the Presence of 
Minicircle or Miniplasmid Topoisomers 

This example demonstrates the presence of topological forms derived
i) from the plasmid possessing the attP and attB sequences in the
direct orientation, ii) from the minicircle or iii) from the
miniplasmid, after the action of the integrase of bacteriophage λ in
E. coli. This example also shows that these topological or oligomeric
forms may be resolved by using the par locus of RK2 (Gerlitz et al.,
1990 J. Bacteriol. 172 p. 6194). In effect, this locus contains, in
particular, the parA gene coding for a resolvase acting at the mrs
(multimer resolution system) site (Eberl et al., 1994 Mol. Microbiol.
12 p. 131).

8-1. Construction of Plasmids pXL2777 and pXL2960

Plasmids pXL2777 and pXL2960 are derived from the vector pXL2776, and
possess in common the minimal replicon of ColE1, the gene of the
transposon Tn5 coding for resistance to kanamycin and the attP and
attB sequences of bacteriophage λ in the direct orientation. These
plasmids differ in respect of the genes inserted between the attP and
attB sequences, in particular pXL2777 contains the omegon cassette
(coding for the gene for resistance to spectinomycin) whereas plasmid
pXL2960 carries the par locus of RK2.

8-1.1. Minimal vector pXL2658

The vector pXL2658 (2.513 kb) possesses the minimal replicon of ColE1
originating from pBluescript (ori) and the gene of the transposon Tn5
coding for resistance to kanamycin (Km) as selectable marker. After
the BsaI end has been blunted by the action of the Klenow enzyme, the
1.15-kb BsaI-PvuII fragment of pBKS+ (obtained from Stratagene) was
cloned with the 1.2-kb SmaI fragment of pUC4KIXX (obtained from
Pharmacia) to generate the plasmid pXL2647. The oligonucleotides 5542
5'(AGC TTC TCG AGC TGC AGG ATA TCG AAT TCG GAT CCT CTA GAG CGG CCG
CGA GCT CC)3' (SEQ ID No. 20) and 5543 5'(AGC TGG AGC TCG CGG CCG CTC
TAG AGG ATC CGA ATT CGA TAT CCT GCA GCT CGA GA)3' (SEQ ID No. 21)
were hybridized with one another and then cloned at the HindIII site
of pXL2647; in this way pXL2658 is constructed. In this plasmid, the
multiple cloning site is SstI, NotI, XbaI, BamHI, EcoRI, EcoRV, PstI,
XhoI and HindIII between the origin of replication and the gene
coding for resistance to kanamycin.

8-1.2. Vector pXL2776 containing the attP and attB sequences of
phage λ

The vector pXL2776 (2.93 kb) possesses the minimal replicon of ColE1
originating from pBluescript, the gene coding for resistance to
kanamycin and the attP and attB sequences of bacteriophage λ in the
direct orientation, see FIG. 8. The 29-bp attB sequence (Mizuuchi et
al., 1980 Proc. Natl. Acad. Sci. USA 77 p. 3220) was introduced
between the SacI and HindIII restriction sites of pXL2658 after the
sense oligonucleotide 6194 5'(ACT AGT GGC CAT GCA

perature is 50° C. The PCR product digested at the XbaI and HindIII
sites was cloned into the phage M13mpEH between the XbaI and HindIII
sites. The amplified sequence is identical to the attP sequence
described in Lambda II (edited by R. W. Hendrix, J. W. Roberts, F. W.
Stahl, R. A. Weisberg; Cold Spring Harbor Laboratory 1983) between
positions 27480 and 27863.

8-1.3. Plasmid pXL2777

Plasmid pXL2777 (6.9 kb) possesses the minimal replicon of ColE1
originating from pBluescript, the gene coding for resistance to
kanamycin, the attP and attB sequences of bacteriophage λ in the
direct orientation and separated by the sacB gene coding for
levansucrase of B. subtilis (P. Gay et al., 1983 J. Bacteriol. 153 p.
1424), and the Sp omegon coding for the gene for resistance to
spectinomycin Sp and streptomycin Sm (P. Prentki et al., 1984 Gene 29
p. 303). The sacB-Sp cassette having EcoRV and NsiI cloning ends
comes from the plasmid pXL2757 (FR95/01632) and was cloned between
the EcoRV and NsiI sites of pXL2776 to form pXL2777.

8-1.4. Plasmid pXL2960

Plasmid pXL2960 (7.3 kb) possesses the minimal replicon of ColE1
originating from pBluescript, the gene coding for resistance to
kanamycin and the attP and attB sequences of bacteriophage λ in the
direct orientation and separated by i) the sacB gene coding for
levansucrase of B. subtilis (P. Gay et al., 1983 J. Bacteriol. 153 p.
1424) and ii) the par locus of RK2 (Gerlitz et al., 1990 J.
Bacteriol. 172 p. 6194). The par cassette having BamHI ends comes
from the plasmid pXL2433 (PCT/FR95/01178) and was introduced between
the BamHI sites of pXL2777 to generate pXL2960.

8-2. Resolution of Minicircle or Miniplasmid Topoisomers

Plasmids pXL2777 and pXL2960 were introduced by transformation into
E. coli strain D1210HP. The transformants were selected and analysed
as described in Example 2, with the following modifications: the
expression of the integrase was induced at 42° C. for 15 min when the
optical density of the cells at 610 nm is 1.8, and the cells are then
incubated at 30° C. for 30 min, see FIG. 9, or for a period varying
from 2 minutes to 14 hours (O/N), see FIG. 10. The plasmid DNA
originating from uninduced and induced cultures was then analysed on
agarose gel before or after digestion with a restriction enzyme
exclusive to the minicircle portion (EcoRI) or miniplasmid portion
(BglII), see Figure Y, or after the action of DNA topoisomerase A or
the gyrase of E. coli. The supercoiled dimer forms of minicircle or
miniplasmid are clearly revealed by i) their molecular weight, ii)
their linearization by the restriction enzyme, iii) their change in
topology through the action of topoisomerase A (relaxed dimer) or of
the gyrase

TCC GCT CAAGTTAGT ATAAAAAAG CAG GCTTCA (supersupercoiled dimer), iv) specific hybridization with an 

G)3' (SEQ ID No.22) has been hybridized with the antisense internal fragment pecuUar to the minicircle or the miniplas- 

oligonucleotide 6195 5'(AGC TCT GAA GCC TGC TTT mid. Other topological forms of higher molecular weights 

TTTATACTAACTTGAGCG GAT GCA TGG CCA CTA than that of the initial plasmid originate from the initial 

GTA GCT)3' (SEQ ID No.23) in such a way that the Sad 55 plasmid or the minicircle or the miniplasmid, since they 

and Hindi!! sites are no longer re-formed after cloning. This disappear after digestion with the restriction enzyme exclu- 

plasmid, the sequence of which was verified with respect to sive to the minicircle portion (EcoRI) or miniplasmid por- 

attB, is then digested with Spel and Nsil in order to tion (Bglll). These forms are much less abundant with 

introduce in it the attP sequence flanked by the Nsil and pXL2960 than with pXL2777 as initial plasmid, see FIG. 10. 

Xbal restriction sites and thus to generate plasmid pXL2776. 60 In particular, the dimer form of minicircle is present to a not 

The attP sequence was obtained by PCR amplification using insignificant extent with plasmid pXL2777, whereas it is 

plasmid pXL2649 (described in Example 2) as template, the invisible with plasmid pXL2960 when the cells are incu- 

sense oligonucleotide 6190 5'(GCG TCT AGA ACA GTA bated for at least 30 min at 30° C, see FIGS. 9 and 10. It 

TCG TGA TGA CAG AG)3' (SEQ ID No.24) and the should be noted that minicircle dimers are observed at the 

antisense oligonucleotide 6191 5'(GCC AAG CIT AGC 65 beginning of the kinetic experiment with pXL2960 (2 to 10 

TTT GCA CTG GAT TGC GA)3' (SEQ ID No.25), and min), and are thereafter resolved (after 30 min), see FIG. 10. 

performing 30 cycles during which the hybridization tem- Consequently, the par locus leads to a significant reduction 
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in the oligomeric/topological forms resulting from the action 
of the integrase of bacteriophage 1 in E. coli on plasmids 
containing the attP and attB sequences in the direct orien- 
tation. 
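Section 8-1.1 states that oligonucleotides 5542 and 5543 were hybridized with one another and cloned at the HindIII site. As a rough consistency check on the printed sequences (SEQ ID No. 20 and 21), the sketch below anneals the two strands in silico and reports the single-stranded ends. The function names are illustrative only, and the model assumes that neither strand carries a 3' overhang, which is a simplification and not something the specification states.

```python
# Minimal sketch: anneal two complementary oligonucleotides and report
# their 5' overhangs. Assumes no 3' overhangs (an assumption of this example).

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Sequences as printed for SEQ ID No. 20 and SEQ ID No. 21.
OLIGO_5542 = "AGCTTCTCGAGCTGCAGGATATCGAATTCGGATCCTCTAGAGCGGCCGCGAGCTCC"
OLIGO_5543 = "AGCTGGAGCTCGCGGCCGCTCTAGAGGATCCGAATTCGATATCCTGCAGCTCGAGA"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def anneal(top: str, bottom: str):
    """Return (top 5' overhang, duplex read on the top strand, bottom 5' overhang).

    Tries the shortest possible top-strand overhang first and returns None
    if the strands cannot pair under the no-3'-overhang assumption.
    """
    rc_bottom = reverse_complement(bottom)
    for k in range(len(top)):
        paired = top[k:]
        if rc_bottom.startswith(paired):
            bottom_overhang = bottom[: len(bottom) - len(paired)]
            return top[:k], paired, bottom_overhang
    return None


top_overhang, duplex, bottom_overhang = anneal(OLIGO_5542, OLIGO_5543)
print(top_overhang, len(duplex), bottom_overhang)
```

With the sequences as printed, both 5' overhangs come out as AGCT, which is the overhang left by HindIII (A^AGCTT), so the annealed linker is compatible with the HindIII site named in the text.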

IDENTIFICATION OF THE NUCLEOTIDE SEQUENCES



SEQ ID No. 1: oligonucleotide 5476:
5'-AATTGTGAAGCCTGCTTTTTTATACTAACTTGAGCGG-3'

SEQ ID No. 2: oligonucleotide 5477:
5'-AATTCCGCTCAAGTTAGTATAAAAAAGCAGGCTTCAC-3'

SEQ ID No. 3: oligonucleotide 4957:
5'-GATCCGAAGAGAGAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAC-3'

SEQ ID No. 4: oligonucleotide 4958:
5'-GATCGTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCG-3'

SEQ ID No. 5: oligonucleotide poly-CTT:
5'-GAGGCTTCTTCTTCTTCTTCTTCTT-3'

SEQ ID No. 7: (attP sequence of phage lambda):
5'-CAGCTTTTTTATACTAAGTTG-3'

SEQ ID No. 8: (attB sequence of phage P22):
5'-CAGCGCATTCGTAATGCGAAG-3'

SEQ ID No. 11: (attP sequence of phage φ80):
5'-AACACTTTCTTAAATTGTC-3'

SEQ ID No. 12: (attB sequence of phage HP1):
5'-AAGGGATTTAAAATCCCTC-3'

SEQ ID No. 13: (attP sequence of phage HP1):
5'-ATGGTATTTAAAATCCCTC-3'

SEQ ID No. 14: (att sequence of plasmid pSAM2):
5'-TTCTCTGTCGGGGTGGCGGGATTTGAACCCACGACCTCTTCGTCCCGAA-3'

SEQ ID No. 17: oligonucleotide 6008:
5'-GATCAGATCTGCAGTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCA-3'

SEQ ID No. 18: (Sequence present in plasmid pXL2727 corresponding to the oligonucleotide 6008):
5'-GATCAGATCTGCAGTCTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTCTTCTTCA-3'

SEQ ID No. 19: (oligonucleotide 116418):
5'-CTTCTTCTTCTTCTTCTTCTT-3'

SEQ ID No. 20: (oligonucleotide 5542):
5'-AGCTTCTCGAGCTGCAGGATATCGAATTCGGATCCTCTAGAGCGGCCGCGAGCTCC-3'

SEQ ID No. 21: (oligonucleotide 5543):
5'-AGCTGGAGCTCGCGGCCGCTCTAGAGGATCCGAATTCGATATCCTGCAGCTCGAGA-3'

SEQ ID No. 22: sense oligonucleotide 6194:
5'-ACTAGTGGCCATGCATCCGCTCAAGTTAGTATAAAAAAGCAGGCTTCAG-3'

SEQ ID No. 23: antisense oligonucleotide 6195:
5'-AGCTCTGAAGCCTGCTTTTTTATACTAACTTGAGCGGATGCATGGCCACTAGTAGCT-3'

SEQ ID No. 24: sense oligonucleotide 6190:
5'-GCGTCTAGAACAGTATCGTGATGACAGAG-3'

SEQ ID No. 25: antisense oligonucleotide 6191:
5'-GCCAAGCTTAGCTTTGCACTGGATTGCGA-3'

SEQUENCE LISTING

(iii) NUMBER OF SEQUENCES: 25

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
AATTGTGAAG CCTGCTTTTT TATACTAACT TGAGCGG



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(A) DESCRIPTION: /desc = "Oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
AATTCCGCTC AAGTTAGTAT A



(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

AGAAGAAGAA GAAGAAGAAG AAGAAGAAGA AGAAGAAGAA GAAGAAC

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
GATCGTTCTT CTTCTTCTTC T

(2) INFORMATION FOR SEQ ID NO: 5:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
GAGGCTTCTT CTTCTTCTTC TTCTT

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
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-continued 



) SEQUENCE DESCRIPTION: SEQ I 



(2) INFORMATION FOR SEQ ID NO: 7: 

<i) SEQUENCE CHftRACTERISTICS : 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
CAGCTTTTTT ATACTAAGTT G

(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:



(2) INFORMATION FOR SEQ ID NO: 9:

(C) STRANDEDNESS: double

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

(2) INFORMATION FOR SEQ ID NO: 11:

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
AACACTTTCT TAAATTGTC



(2) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS:

(C) STRANDEDNESS: double

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
AAGGGATTTA AAATCCCTC

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
ATGGTATTTA AAATCCCTC 



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 49 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
TTCTCTGTCG GGGTGGCGGG ATTTGAACCC ACGACCTCTT CGTCCCGAA 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(A) DESCRIPTION: /desc = "Oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
CGTCGAAATA TTATAAATTA TCAGACA 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 66 base pairs 

(B) TYPE: nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 66 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



^ GAAGAACTGC 



GATCAGATCT GCAGTTCTTC TTCTTCTTCT TCTTCTTCTT CTTCTTCTTC TTCTTCTTCT 
TCTTCA 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
GATCAGATCT GCAGTCTCTT CTTCTTCTTC TTCTTCTTCT TCTTCTTCTC TTCTTCA 



(2) INFORMATION FOR SEQ ID NO: 19:
(i) SEQUENCE CHARACTERISTICS:

CTTCTTCTTC TTCTTCTTCT T

(2) INFORMATION FOR SEQ ID NO: 20:

(C) STRANDEDNESS: double

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
AGCTTCTCGA GCTGCAGGAT ATCGAATTCG GATCCTCTAG

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 56 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Oligonucleotide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

AGCTGGAGCT CGCGGCCGCT CTAGAGGATC CGAATTCGAT ATCCTGCAGC TCGAGA 56 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 49 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Oligonucleotide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

ACTAGTGGCC ATGCATCCGC TCAAGTTAGT ATAAAAAAGC AGGCTTCAG 49 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Oligonucleotide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 

AGCTCTGAAG CCTGCTTTTT TATACTAACT TGAGCGGATG CATGGCCACT AGTAGCT 57 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Oligonucleotide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 

GCGTCTAGAA CAGTATCGTG ATGACAGAG 29 



(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs 

(C) STRANDEDNESS: double 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Oligonucleotide" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
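
Several entries of the sequence listing print the residues in groups of ten followed by a running count (for example the entries for SEQ ID NO: 21 to 24). The following sketch shows one way such a line could be checked against its own count. The helper name is hypothetical, and the line layout it expects (space-separated base groups followed by an integer) is an assumption drawn from the intact entries only.

```python
# Minimal sketch: verify that the trailing count on a sequence-listing line
# matches the number of bases printed on that line.
import re


def listing_line_is_consistent(line: str) -> bool:
    """Return True when the final integer equals the number of bases before it."""
    *groups, count = line.split()
    bases = "".join(groups)
    if not re.fullmatch(r"[ACGT]+", bases):
        raise ValueError(f"unexpected characters in sequence: {bases!r}")
    return len(bases) == int(count)


LINES = [
    "AGCTGGAGCT CGCGGCCGCT CTAGAGGATC CGAATTCGAT ATCCTGCAGC TCGAGA 56",
    "ACTAGTGGCC ATGCATCCGC TCAAGTTAGT ATAAAAAAGC AGGCTTCAG 49",
    "AGCTCTGAAG CCTGCTTTTT TATACTAACT TGAGCGGATG CATGGCCACT AGTAGCT 57",
    "GCGTCTAGAA CAGTATCGTG ATGACAGAG 29",
]

for line in LINES:
    print(listing_line_is_consistent(line), line)
```

A check of this kind only confirms internal consistency of a printed line; it says nothing about whether the sequence itself is correct.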
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What is claimed is: 

1. A double-stranded DNA molecule, comprising an
expression cassette containing a gene of interest under
control of a transcription promoter and a transcription
terminator active in a mammalian cell, wherein said molecule:

is in circular and supercoiled form,
lacks an origin of replication,
lacks a marker gene, and

comprises a region resulting from site-specific recombination
between two sequences, said region being
located outside the expression cassette.

2. The molecule according to claim 1, further comprising 
a sequence which interacts specifically with an oligonucle- 
otide to form a triple helix by hybridization. 

3. The molecule according to claim 2, wherein the 
sequence which forms a triple helix comprises from 5 to 30 
base pairs. 

4. The molecule according to claim 2, wherein the 
sequence which forms a triple helix is a homopurine- 
homopyrimidine sequence. 

5. The molecule according to claim 1, wherein said region 
results from site-specific recombination between two att 
attachment sequences, two recognition sequences of a 
resolvase of a transposon, or two mrs sequences of plasmid 
RK2. 

6. The molecule according to claim 1, further comprising
an mrs sequence originating from a par locus of RK2.

7. The molecule according to claim 1, wherein the gene of 
interest is a nucleic acid coding for a therapeutic, vaccine, 
agricultural, or veterinary product. 

8. The molecule according to claim 1, wherein said
molecule is obtained by excision from a plasmid or
chromosome by site-specific recombination.

9. A recombinant DNA comprising a polynucleotide
comprising an expression cassette positioned between two
sequences positioned in direct orientation, which recombine
by site-specific recombination in the presence of a
recombinase, wherein said expression cassette comprises a
gene of interest under control of a transcription promoter
and a transcription terminator active in a mammalian cell,
and wherein said polynucleotide lacks an origin of
replication and a marker gene.

10. The recombinant DNA according to claim 9 further
comprising an origin of replication and, optionally, a marker
gene, wherein the origin of replication and optional marker
gene are located outside said polynucleotide.

11. The recombinant DNA according to claim 9, wherein 
the recombinase is a recombinase of an integrase family of 
phage lambda or of a resolvase family of transposon Tn3. 

12. The recombinant DNA according to claim 9, wherein
the two sequences which recombine by site-specific
recombination are derived from a bacteriophage.

13. The recombinant DNA according to claim 12, wherein
the two sequences which recombine by site-specific
recombination consist of att attachment sequences of a
bacteriophage or sequences derived therefrom.

14. The recombinant DNA according to claim 13, wherein
the two sequences which recombine by site-specific
recombination consist of attachment sequences of bacteriophage
lambda, P22, φ80, P1, or HP1, of plasmid pSAM2, or of
sequences derived therefrom.

15. The recombinant DNA according to claim 14, wherein 
the sequences which recombine by site-specific recombina- 
tion comprise all or part of SEQ ID Nos. 1, 2, 6, 7, 8, 9, 10, 
11, 12, 13, or 14. 

16. The recombinant DNA according to claim 12, wherein
the two sequences which recombine by site-specific
recombination are derived from bacteriophage P1.

17. A plasmid comprising:

a) a bacterial origin of replication and optionally, a marker
gene; and

b) a polynucleotide comprising an expression cassette
positioned between attP and attB sequences of a
bacteriophage lambda, P22, φ80, P1, or HP1, or of
plasmid pSAM2, positioned in direct orientation, which
recombine by site-specific recombination in the
presence of a recombinase, wherein said expression
cassette comprises a gene of interest under control of a
transcription promoter and a transcription terminator
active in a mammalian cell, and wherein said
polynucleotide lacks an origin of replication and a marker
gene.

18. The plasmid according to claim 17, wherein the attP 
and attB sequences which recombine by site-specific recom- 
bination are attachment sequences of bacteriophage lambda. 

19. A plasmid comprising: 

a) a bacterial origin of replication and optionally, a marker 
gene; and 

b) a polynucleotide comprising an expression cassette 
positioned between two inverted repeat sequences of 
bacteriophage PI (loxP region) positioned in direct 
orientation, which recombine by site specific recombi- 
nation in the presence of a recombinase; wherein the 
expression cassette comprises a gene of interest under 
control of a transcription promoter and a transcription 
terminator active in a mammalian cell, and wherein the 
polynucleotide lacks an origin of replication and a 
marker gene. 

20. The recombinant DNA according to claim 9, wherein 
the two sequences which recombine by site-specific recom- 
bination are derived from a transposon. 

21. The recombinant DNA according to claim 20, wherein 
the two sequences which recombine by site-specific recom- 
bination consist of recognition sequences of a resolvase of a 
transposon Tn3, Tn21, or Tn522, or sequences derived 
therefrom. 

22. The recombinant DNA according to claim 21, wherein
the two sequences which recombine by site-specific
recombination comprise all or part of sequence SEQ ID No. 15.

23. The recombinant DNA according to claim 9, wherein 
the two sequences which recombine by site-specific recom- 
bination are derived from a par region of plasmid RP4. 

24. The recombinant DNA according to claim 9, wherein 
the expression cassette further comprises a sequence which 
interacts specifically with an oligonucleotide to form a triple 
helix by hybridization. 

25. A plasmid comprising: 

a) an origin of replication and optionally, a marker gene; 
and 

b) a polynucleotide comprising at least one gene of 
interest and a sequence which interacts specifically 
with an oligonucleotide to form a triple helix by
hybridization, wherein the at least one gene of interest 
and the oligonucleotide interacting sequence are posi- 
tioned between two sequences positioned in direct 
orientation, which recombine by site-specific recombi- 
nation in the presence of a recombinase, and wherein 
the polynucleotide lacks an origin of replication and a 
marker gene. 

26. A plasmid comprising: 

a) an origin of replication and optionally, a marker gene; 

b) a polynucleotide comprising at least one gene of
interest and an mrs sequence originating from a par
locus of plasmid RK2, wherein the at least one gene of
interest and the mrs sequence are positioned between 
two sequences positioned in direct orientation, which 
recombine by site-specific recombination in the pres- 
ence of a recombinase, and wherein the polynucleotide 
lacks an origin of replication and a marker gene. 

27. The plasmid according to claim 26, wherein the 
polynucleotide further comprises a sequence which interacts 
specifically with an oligonucleotide to form a triple helix by 
hybridization, wherein the oligonucleotide interacting 
sequence is placed between the two sequences positioned in 
direct orientation, which recombine by site-specific recom- 
bination. 

28. A plasmid comprising: 

a) an origin of replication and optionally, a marker gene; 

b) a polynucleotide comprising: 

1) a first set of two sequences positioned in direct 
orientation, which recombine by integrase- 
dependent site-specific recombination; 

2) a second set of two sequences positioned in direct
orientation, which recombine by resolvase-dependent
site-specific recombination;

3) at least one gene of interest; and,

4) optionally, a sequence which interacts specifically
with an oligonucleotide to form a triple helix by
hybridization,

wherein each integrase-dependent sequence of 1) is
positioned next to a resolvase-dependent sequence of 2) and
wherein the at least one gene of 3) and the optional
oligonucleotide interacting sequence of 4) are placed between the
integrase-dependent/resolvase-dependent sequences, and
wherein the polynucleotide lacks an origin of replication and
a marker gene.

29. A cultured recombinant cell comprising one or more 
copies of the recombinant DNA according to claim 9 
inserted into its genome. 

30. A cultured recombinant cell comprising the recombi- 
nant DNA according to claim 10. 

31. The cultured recombinant cell according to claim 30, 
wherein said cell is a bacterium. 

32. The cultured recombinant cell according to claim 30, 
wherein said cell is a eukaryotic cell. 

33. The cultured recombinant cell according to claim 31, 
wherein the bacterium is Escherichia coli D1210HP with 
accession number 1-2314. 

34. A method for preparation of the DNA molecule
according to claim 1, comprising culturing 1) a host cell
comprising a recombinant DNA comprising a nucleic acid
consisting of an expression cassette positioned between two
sequences positioned in direct orientation, which recombine
by site-specific recombination in the presence of a
recombinase, and wherein the expression cassette comprises
a gene of interest under control of a transcription promoter
and a transcription terminator active in a mammalian cell
with 2) a recombinase, whereby site-specific recombination
occurs between the two sequences positioned in direct
orientation.



35. The method according to claim 34, wherein said 
expression cassette is positioned between two bacteriophage 
sequences, which are positioned in direct orientation and 
recombine by site-specific recombination. 

36. The method according to claim 34, wherein the 
cultured host cell is brought into contact with the recombi- 
nase by transfecting or infecting the cultured host cell with 
a plasmid or a phage containing a gene for the recombinase. 

37. The method according to claim 34, wherein the
cultured host cell is brought into contact with the recombi- 
nase by inducing expression of a gene coding for the 
recombinase, wherein the gene is present in the host cell. 

38. The method according to claim 37, wherein the host
cell comprises within its genome a recombinase gene having
temperature-regulated expression, and wherein the cultured
host cell is brought into contact with the recombinase by
culturing the host cell at an induction temperature of the
recombinase gene, whereby expression of the recombinase
gene is induced.

39. The method according to claim 38, wherein the host 
cell comprises a lysogenic phage integrated in its genome 
and wherein the lysogenic phage comprises the gene for the 
recombinase. 

40. A method for preparation of the DNA molecule 
according to claim 1, comprising combining: 

a) a replicative plasmid comprising:

1) an origin of replication and optionally, a marker 
gene; and 

2) a polynucleotide comprising an expression cassette 
positioned between two sequences positioned in 
direct orientation, which recombine by site-specific 
recombination in the presence of a recombinase, 
wherein the expression cassette comprises a gene of 
interest under control of a transcription promoter and 
a transcription terminator active in a mammalian 
cell, and wherein the polynucleotide lacks an origin 
of replication and a marker gene; and 

b) a recombinase, whereby site-specific recombination 
occurs between the two sequences of 2) positioned in 
direct orientation. 

41. The method according to claim 34, further comprising 
purifying a minicircle formed by said site-specific recom- 
bination. 

42. The method according to claim 41, wherein the 
minicircle is purified by contacting the minicircle with a 
specific oligonucleotide that is grafted onto a support, 
whereby a triple helix is formed by hybridization of said 
specific oligonucleotide with a specific sequence present in 
the minicircle. 

43. The recombinant DNA according to claim 9, wherein
the two sequences which recombine by site-specific
recombination are from a 2μ plasmid.

Phage λ integrative and excisive recombination normally proceeds by a pair of sequential strand exchanges. During the first exchange reaction, the "top" strand in each recombination site is cleaved, exchanged, and religated, generating a Holliday junction intermediate. This intermediate DNA structure is resolved through a pair of reciprocal "bottom" strand exchanges, leading to recombinant products. The strict co-ordination of exchange reactions ensures religation between correct partner strands only. Here we show that the directionality of recombination is altered in vivo by two mutant integrases, Int-h (E174K) and a double mutant Int-h/218 (E174K/E218K). This change in directionality leads to deletion instead of inversion on substrates that carry inverted attachment sites and, depending on the pair of target sites employed, requires the presence or absence of integration host factor. Neither Fis nor Xis is involved in deletion. Sequence analyses of deletion products reveal that the newly generated hybrid attachment site exhibits a reversed genetic polarity. We demonstrate that only one of two possible hybrid site configurations is generated and discuss two pathways leading to deletion. In the first, deletion results from a wrong alignment of the two recombination sites within the synaptic complex. In the second pathway, the uncoordinated cleavage by the mutant integrases of all four DNA strands present in a conventional Holliday junction intermediate leads to two double-stranded breaks, whereby the subsequent rejoining between "wrong" partner strands appears restricted to only two strands.

© 1999 Academic Press
Keywords: site-specific recombination; λ integrase; strand exchange;
integration host factor; Holliday junction isomerization



Introduction



The phage λ-encoded integrase protein (Int) is the prototype of the so-called integrase family which catalyzes conservative site-specific recombination between two DNA target sites. Int executes both the integration and excision of the phage into and out of the Escherichia coli genome, respectively. The structure of the catalytic domain of Int has recently been solved (Kwon et al., 1997). The Int system represents, therefore, one of the best understood recombination systems (for reviews, see Landy, 1989, 1993; Sadowski, 1993; Nash, 1996; Hallet & Sherratt, 1997; Yang & Mizuuchi, 1997).



Abbreviations used: Int, phage-encoded integrase; IHF, integration host factor; Fis, factor for inversion stimulation; Xis, phage-encoded excisionase.

E-mail address of the corresponding author:
p.droege@uni-koeln.de



Integrative and excisive recombination occurs between pairs of attachment sites, termed attP/attB and attL/attR, respectively. Each att sequence is composed of two core Int binding sites separated by a seven base-pair overlap region. The overlap sequence is identical in all wild-type att sites, and identity is a prerequisite for efficient recombination. In addition to core sites where strand cleavage and religation occurs, each site except attB contains additional Int binding sites, so-called arm sites. A varying number of flanking recognition sequences for the accessory DNA-bending proteins integration host factor (IHF), factor for inversion stimulation (Fis), or the phage-encoded Xis protein are also present in the flanking regions, again with the exception of attB. Int is a heterobivalent DNA-binding protein and, with assistance from the accessory proteins, is able to bind simultaneously to core and arm sites within the same att site (Moitoso de Vargas et al., 1988, 1989). Depending on both the presence and number of accessory factors, the resulting nucleoprotein structures at attP, attL, and attR exhibit different architectures, and it is this difference that controls for the directionality of the reaction, i.e. integration or excision (Moitoso de Vargas & Landy, 1991).

In the first step leading to integrative recombination, a specialized nucleoprotein structure, the intasome, is formed between Int, IHF, and supercoiled attP (Better et al., 1982; Richet et al., 1986). The second step involves the pairing of the intasome with protein-free attB; the latter consists only of two core sites and the overlap region. Hence, Int monomers which catalytically act upon attB are in this case exclusively provided from attP (Richet et al., 1988; Patsey & Bruist, 1995). There is no DNA topological constraint imposed on synapsis between attP and attB, which explains why both direct and inverted pairs of att sites present on the same DNA molecule are efficiently recombined in vitro and in vivo, leading to deletion or inversion of the intervening DNA segment, respectively.
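
The last sentence of the preceding paragraph (direct repeats give deletion, inverted repeats give inversion) can be illustrated with a deliberately simple string model. This is only a toy sketch: the att sites themselves are not represented, the flanking and intervening sequences are arbitrary placeholders, and the function and parameter names are hypothetical.

```python
# Toy model of intramolecular recombination between two att sites.
# "left" and "right" are the flanking DNA, "segment" is the DNA between the
# sites; the att sites themselves are left out for brevity.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def recombine(left: str, segment: str, right: str, orientation: str):
    """Return (remaining molecule, excised circle or None).

    "direct"   -> the intervening segment is deleted as a separate circle.
    "inverted" -> the intervening segment stays in place but is inverted.
    """
    if orientation == "direct":
        return left + right, segment
    if orientation == "inverted":
        return left + reverse_complement(segment) + right, None
    raise ValueError("orientation must be 'direct' or 'inverted'")


print(recombine("AAA", "GATTC", "TTT", "direct"))
print(recombine("AAA", "GATTC", "TTT", "inverted"))
```

In this picture, the mutant integrases discussed in the abstract are unusual because they produce the "direct" outcome (deletion) on substrates whose sites are arranged as in the "inverted" case.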

In the third step, Int catalyzes a reciprocal exchange of the "top" strands at the left boundary of the overlap region, which results in a Holliday junction recombination intermediate (Figure 1(a) and (b)). This intermediate DNA structure is resolved by exchange of the "bottom" strands at the right boundary of the overlap region (Figure 1(b) to (d)). Thus, Int executes an ordered, sequential pair of strand exchanges, i.e. cleavage, exchange, and rejoining of one pair of recombination partner strands is completed before initiating the same reactions on the other pair of strands (Nunes-Düby et al., 1987; Kitts & Nash, 1988). During these reactions, Int becomes covalently attached in cis to the broken DNA strand through a 3'-phosphotyrosine linkage, which is subsequently resolved when the 5'-hydroxyl group of the invading strand attacks the linkage and displaces the recombinase (Figure 1(c) and (d); Burgin & Nash, 1992; Nunes-Düby et al., 1994). Neither supercoiling of attP nor the presence of IHF seems to be required for catalysis of these chemical reactions. Integrative recombination eventually leads to the formation of attL and attR, which are the targets for excision (Figure 1(d)).

Figure 1. Conventional strand exchange during λ integrative recombination. (a) attB (genetic polarity BOB') and attP (POP') are aligned in antiparallel orientation. The Int monomer (filled oval) bound to either the B arm (marked B) or P arm (marked P) initiates a nucleophilic attack (filled arrows) against the top strand within each att site. (b) Reciprocal strand exchange between top strands has been completed, leading to a Holliday junction intermediate structure. (c) The Int monomers bound to the B' and P' arm have cleaved the bottom strands at the right ends of both overlap regions and are thus covalently linked to the DNA. (d) Reciprocal bottom strand exchange has been completed, leading to attL (genetic polarity BOP') and attR (POB'), which are the natural targets for excisive recombination.
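
The genetic polarities given in the legend to Figure 1 can be followed with a small bookkeeping sketch: each att site is written as a left element, the overlap O, and a right element, and recombination swaps the right-hand elements of the two partners. The class and function names are illustrative assumptions, and the model tracks only the labels, not the chemistry of the two ordered strand exchanges.

```python
# Bookkeeping sketch for att-site genetic polarity (labels only).
from typing import NamedTuple, Tuple


class AttSite(NamedTuple):
    left: str   # element to the left of the overlap region
    right: str  # element to the right of the overlap region

    def polarity(self) -> str:
        """Return the polarity string, e.g. BOB' for attB."""
        return f"{self.left}O{self.right}"


def exchange(site1: AttSite, site2: AttSite) -> Tuple[AttSite, AttSite]:
    """Swap the right-hand elements of two sites across the overlap region."""
    return AttSite(site1.left, site2.right), AttSite(site2.left, site1.right)


attB = AttSite("B", "B'")
attP = AttSite("P", "P'")

# Integration: attB x attP gives attL (BOP') and attR (POB').
attL, attR = exchange(attB, attP)
print(attL.polarity(), attR.polarity())

# Excision is the genetic reversal: attL x attR restores attB and attP.
print([site.polarity() for site in exchange(attL, attR)])
```

Applying the same swap twice returns the starting sites, which mirrors the statement that excision is genetically the reversal of integration.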

Excisive recombination is genetically the exact reversal of integration, but employs different nucleoprotein structures on attL/attR. In addition to IHF, which again serves mainly as an architectural protein at these sites (Goodman et al., 1992), the λ phage-encoded Xis protein is required for the formation of a recombinogenic nucleoprotein complex at attR. Thus, in contrast to integrative recombination, two separate nucleoprotein structures are formed before synapsis occurs by random collision (Kim & Landy, 1992). It was also shown that Int, in the absence of any accessory proteins, can align attP and attL within a bi-molecular complex, and can even recombine pairs of attL (Segall & Nash, 1993, 1996). However, the order of strand exchange during excisive recombination is the same as observed for integrative recombination.

According to a recently proposed model, the
switch from top to bottom strand exchange
involves isomerization of the Holliday junction
with concomitant restacking of base-pairs within
the overlap region (Nunes-Düby et al., 1995; Azaro
& Landy, 1997). How Int controls this step and,
thus, ensures the order of strand exchange during
integrative and excisive recombination is not
known. There is evidence that specific protein-protein
interactions are required between at least three
Int protomers bound to core sites within a Holliday
junction (Franz & Landy, 1990; Kho & Landy,
1994). In addition, Int mutants have been isolated
which indicate that specific interactions between
Int monomers are important in the co-ordination of
strand cleavage events within this intermediate
structure (Han et al., 1994). Conformational
changes of both the catalytic domain and the molecular
interface within and between Int monomers,
respectively, are probably involved in co-ordinating
the sequence of strand exchanges. This can be
inferred from the structure of the Cre-loxP Holliday
junction intermediate (Gopaul et al., 1998). The
exact roles of IHF and Xis in this co-ordination are
also unclear. There is evidence that both proteins
control the efficiency of resolution of Holliday
junctions in the direction of recombinant products
(Franz & Landy, 1995).

Here, we show that two mutant Int variants,
E174K and E174K/E218K, alter the directionality
of recombination reactions in vivo, leading to deletion
instead of inversion on substrates that carry
two att sites as inverted repeats. The efficiency of
this reaction depends on the type of att sites
employed and the presence or absence of IHF.
Nearly 100% deletion occurs, for example, with
substrates bearing inverted attL and attP sites in
the presence of IHF. However, neither Xis nor Fis
is involved in deletion. Based on DNA sequence
information obtained from various deletion
products, we discuss two possible mechanisms
leading to deletion.

Results 



Integrase mutants 

Here, we analyze the in vivo catalytic activities of
two Int mutants. The first one, termed Int-h, was
originally identified by Miller et al. (1980) in a
screen for λ mutants that overcome the block
imposed on recombination by the himA42
mutation of E. coli. The mutation replaces a glutamate
residue with a lysine residue at codon 174
and maps within the N-terminal region of the catalytic
domain (Figure 2). Subsequent purification
and characterization of Int-h revealed that it promotes
integrative and excisive recombination in
the absence of accessory proteins and supercoiling,
albeit with significantly reduced efficiencies
(Lange-Gustafson & Nash, 1984). It is proposed
that this mutation results in an enhanced affinity
for core sites, which would account for the
increased frequency of in vivo and in vitro integration
into secondary att sites that deviate from
the wild-type attB sequence (Miller et al., 1980;
Patsey & Bruist, 1995).



Figure 2. Genetic map of λ Int. The map highlights
the three important functional domains of Int involved
in DNA arm-binding, core-binding, and catalysis
(Tirumalai et al., 1997). Numbers refer to amino acid
residues. Open circles indicate the position of Alalffi
and Ala126, which make close contact with bases in the
core binding site (Tirumalai et al., 1998). Filled circles
mark the positions of three strictly conserved residues
within the Int family of recombinases, i.e. Arg212,
His308, and Arg311 (reviewed by Nunes-Düby et al.,
1998). The asterisk marks the active site residue, tyrosine
342. Crosses demarcate the positions of four residues
that seem involved in determining recombination
specificity with respect to target sites (Yagil et al., 1995;
Dorgai et al., 1995). Int-h and Int-h/218 mark the
corresponding amino acid changes within two mutant Int
variants analyzed in the present study (see the text).



Our initial goal was to further enhance the ability
of Int-h to perform integration into secondary
att sites in the absence of supercoiling. We thus
introduced, by PCR-directed mutagenesis, a second
mutation into the catalytic domain of Int-h, which
replaces a glutamate residue at codon 218 with a
lysine residue. Our choice for mutating this specific
residue is based on a recent finding (Wu et al.,
1997) that lysine at this position improves binding
of wild-type Int to core sites, presumably through
non-specific contact(s) with the DNA backbone.
We will subsequently refer to the double mutant as
Int-h/218 (Figure 2).

In vivo catalytic activities of Int-h and Int-h/218 

We first tested whether Int-h/218 retains the
ability of Int-h to promote integrative recombination
in the absence of IHF in vivo. Expression vectors
for Int, Int-h, and Int-h/218 were co-transformed
with pNC1, which carries attB and
attP as inverted repeats (Table 1), into either E. coli
strain CSH26 or an isogenic variant, CSH26ΔIHF.
In the latter strain, both genes encoding subunits
of IHF have been destroyed by transposition.
Resulting single colonies were cultivated overnight,
plasmid DNA isolated, subjected to restriction
digests, and analyzed by agarose gel
electrophoresis.

Int and the two variants perform efficient integrative
recombination in the presence of IHF, leading
to 100% inversion without induction of gene
expression by IPTG (Figure 3, lanes 4 to 7). Hence,
leaky Int expression from the p^ promoter in the
presence of lac repressor bound to its operator
sequences is sufficient to promote complete inversion
over the time-course of the experiment. However,
in the absence of IHF, Int is completely
inactive and inversion catalyzed by Int-h is barely
detectable (lanes 8 and 9). Int-h/218 executes
inversion with comparably low efficiency, but we
were surprised to detect a more prominent product
band migrating above that of the expression vector
(demarcated del.; Figure 3, lane 10). The yield of
this new product increases over time (lane 11), but
varies considerably between experiments (data not
shown). The same product could also be detected
with Int-h in some but not all experiments
(Table 1).

Undigested plasmid DNA derived from a
sample obtained with Int-h/218 was resolved
through electrophoresis, the new product DNA
isolated and re-transformed into E. coli. Restriction
analysis of plasmid DNA from single colonies
revealed that the products contain a deletion
between attB and attP (data not shown). DNA
sequencing confirmed that notion. We found two
different products. The first, termed attA1, results
from recombination that joins the left core site of
attB, the so-called B arm, to the left core site of
attP, the P arm (see Figure 6(a)). In the following,
we refer to this type of new att site with reversed
genetic polarity, i.e. BOP instead of BOP', as a
Table 1. DNA substrates and catalytic activities of mutant and wild-type integrase

Schematic representation of recombination substrates. The relative positions and orientations of relevant genetic elements present
on DNA substrates employed in this study are indicated. Arrows demarcate the attachment (att) sites, the open rectangle marks the
position of the kanamycin resistance gene, and the filled rectangle represents the pACYC origin of replication. attP* refers to the
presence of a nucleotide change within the overlap region (see the text). The arm sites within each att and relevant positions of cleavage
sites for restriction enzymes are also indicated.

The (+) sign refers to the presence of the corresponding DNA band after eMdin^
Int-h is significantly less active than Int-h/218.

Inversion observed after 48 hours.
Complete loss of substrate vector.

Barely detectable with Int-h/218 and Int-h after 24 and 48 hours, respectively.

Not determined. The expected recombination products, i.e. inversion for pNC1 through pNC8 and deletion for pNC9 through
pNC12, are boxed.



hybrid site. Our analysis also revealed that in the
particular orientation chosen to depict attA1 (BOP)
with respect to attB/attP (see Figure 6(a)), the top
strand of the overlap region within attA1 is
derived from the bottom strand of attP. However,
the top strand of the overlap in the second product,
attA2, is derived from the top strand of attB
(see Figure 6(a)). From a total of 12 analyzed
sequences, each derived from a single colony after
re-transformation, we found that eight contain the
overlap provided from attP while four carry the
overlap region from attB.
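
The site compositions discussed above can be written out as a small sketch. It is only an illustration of the genetic-polarity notation (left arm, overlap, right arm), assuming the standard composition attB = BOB' and attP = POP'; the names AttSite, conventional_exchange and hybrid_site are hypothetical and are not part of the study. Strand origin of the overlap is not modelled.

```python
from typing import NamedTuple, Tuple


class AttSite(NamedTuple):
    """An att site written as left arm, overlap, right arm (illustrative only)."""
    left: str      # left core-site arm, e.g. "B" or "P"
    overlap: str   # overlap region, labelled "O" throughout
    right: str     # right core-site arm, e.g. "B'" or "P'"

    def polarity(self) -> str:
        # Genetic polarity as used in the text, e.g. "BOP'"
        return f"{self.left}{self.overlap}{self.right}"


# Assumed standard compositions of the two wild-type sites
ATT_B = AttSite("B", "O", "B'")
ATT_P = AttSite("P", "O", "P'")


def conventional_exchange(a: AttSite, b: AttSite) -> Tuple[AttSite, AttSite]:
    """Conventional strand exchange: the two sites swap their right arms."""
    return (AttSite(a.left, a.overlap, b.right),
            AttSite(b.left, b.overlap, a.right))


def hybrid_site(a: AttSite, b: AttSite) -> AttSite:
    """Hybrid site found on deletion products: left arm of a joined to left arm of b."""
    return AttSite(a.left, a.overlap, b.left)


att_l, att_r = conventional_exchange(ATT_B, ATT_P)
print(att_l.polarity(), att_r.polarity())   # conventional products
print(hybrid_site(ATT_B, ATT_P).polarity())  # hybrid site with reversed polarity
```

In this notation the conventional products keep one primed and one unprimed arm, whereas the hybrid site carries two unprimed arms, which is what "reversed genetic polarity" refers to.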

Int-h is known to recombine pairs of att sites
which differ in their overlap sequence by one or
more base-pairs more efficiently than wild-type Int
(Kitts & Nash, 1987; Patsey & Bruist, 1995). This
feature is, at least in part, ascribed to the enhanced
affinity of Int-h to core sites. In order to test
whether deletion can be detected and, if so, how
Int-h/218 performs on such a substrate, we constructed
pNC6, which contains attB and a variant
form of attP (attP*) as inverted repeats (Table 1).
Within attP* a guanine base replaces the third
nucleotide, a thymine residue, of the overlap
sequence. pNC6 was co-transformed with Int
expression vectors, and plasmid DNA was subjected
to restriction analysis. While in the absence
of IHF, all three Int variants are completely inactive
in integrative recombination (data not shown;
Table 1), both Int-h and, with enhanced efficiency,
Int-h/218 execute inversion in the presence of IHF
(Figure 4, lanes 2 and 3). However, both mutants
in addition generate a second prominent product
migrating at the top of the gel. The new product
was isolated and amplified as before, and its
sequence revealed that the B arm of attB is joined
to the P arm of attP, as observed before with
pNC1. In this case, however, we found that the
Figure 3. In vivo catalytic activities of wild-type and mutant Int on pNC1. pNC1, which carries attB and attP as
inverted repeats, was co-transformed into E. coli with expression vectors for either wild-type or mutant Int (see
Materials and Methods). Isolated plasmid DNA was incubated with AvaI and the resulting restriction fragments
analyzed through agarose gel electrophoresis. We show a gel after ethidium bromide staining. Lane 1, kb marker ladder;



expressing Int, Int-h, or Int-h/218, respectively; lane 11, same as lane 10, but DNA isolated after an additional
24 hours incubation. pNC1, unrecombined substrate DNA; expr. vec., Int expression vector; inv., one of two product
bands that results from inversion (the second co-migrates with the expression vector); del., deletion product which is
cleaved only once due to the absence of a second restriction site. The asterisk marks the position of a third product
band which is present in some experiments and which results from homologous recombination as determined by
restriction analysis.



overlap originates in 11 out of 12 analyzed 
sequences from attB (data not shown). 

Intrigued by the relatively high yield of deletion
products obtained with pNC6, we investigated
whether other pairs of inverted att sites can lead to
deletion. We therefore constructed two substrates
for excisive recombination, pNC2 and pNC7,
which contain attL and attR as inverted repeats
(Table 1). The attR site in pNC7, termed attR*, carries
the same mutated overlap sequence as present
in attP*. We found that in the absence of IHF (and
Xis), Int-h and Int-h/218 deleted the segment
between att sites in pNC2 with the same efficiency
as observed with pNC1. Deletion on pNC7 was
barely detectable with Int-h/218 in the presence of
IHF, and could only be detected after 48 hours of
in vivo incubation with Int-h. However, it is noteworthy
that we were unable to detect inversion on
pNC7 either in the presence or absence of IHF
(Table 1). This may indicate that deletion requires
initial steps of the conventional strand exchange
pathway (see Discussion).

The wild-type att sites tested so far are converted
by conventional strand exchange into the expected
recombination products (Table 1; see Figure 6(a)).
It is impossible to determine from these experiments
whether one or both pairs of sites, i.e.
(attB/attP) or/and (attL/attR), are the substrates
for deletion. We therefore constructed a substrate
that cannot be altered by conventional strand
exchange, pNC4, which carries attP and attL as
inverted repeats (Table 1). If recombination occurs
between these sites, this will lead to inversion.
Their genetic polarities do not change, however
(see Figure 6(b)).

We found that in the presence of IHF, nearly
100% of pNC4 is converted into a deletion product
upon reaction with either Int-h or Int-h/218
(Figure 5, lanes 5 and 6). Wild-type Int executes
only inversion (lane 4). In the absence of IHF,
Int and the two variants exclusively catalyze
inversion (lanes 7 to 9). Samples of undigested
deletion products obtained from Int-h/218 and
Int-h were isolated, re-transformed, and DNA
from ten and nine colonies, respectively, subjected
to DNA sequencing. In each case, we
found two hybrid att sites. The first, termed
attA3, was present three or two times, respectively,
and contains the B arm joined to the P
arm (Figure 6(b)). In the orientation chosen for
attA3 in Figure 6(b) (BOP), both arm sites are
separated by the overlap provided from the
bottom strand of attP. In the second product,
attA4, which was found seven times in each
case, the overlap originates from the top strand
of attL. From this analysis, we conclude that
inverted attL and attP can serve as substrate for
a highly efficient change in the directionality of
recombination by mutant Int in the presence of
IHF, leading to almost 100% deletion instead of
inversion.

We further tested whether other E. coli proteins,
in addition to IHF, may play a role in deletion. At
the present stage, using various E. coli strains (see
Materials and Methods), we found that recA, recB,
and recC are not required for deletion with pNC4
Figure 4. In vivo catalytic activities of wild-type and
mutant Int on pNC6. pNC6, which carries
attB and a mutated attP (attP*) as inverted repeats, was
co-transformed with Int expression vectors as before,
and plasmid DNA isolated and subjected to gel
electrophoresis after restriction digest with AvaI (Table 1).
Lanes 1 to 3, DNA isolated from cells expressing Int,
Int-h, or Int-h/218, respectively, in the presence of IHF;
lane 4, unrecombined pNC6; lane 5, expression vector
alone; lane 6, kb marker ladder. inv. demarcates the
position of the two product bands that result from
inversion; del. points to the position of the linearized
deletion product.



and pNC6. In addition, deletion occurs in the
absence of Fis on a derivative of pNC6 (data not
shown).

Only one of two possible new hybrid att sites 
is generated 

So far we have analyzed the products that result
from deletion on substrates that contain inverted
att sites, leading to the identification of a new
hybrid site with reversed genetic polarity, i.e. BOP
instead of BOP'. In order to address the question
whether mutant Int can also generate the corresponding
second hybrid site composed of B' (or
P') arm, overlap, and P' arm (B'OP'), we first constructed
a series of substrates that contain different
pairs of att sites as direct repeats (pNC9 to pNC12;
Table 1). If the mutant Int proteins execute a complete
set of alternative strand exchange reactions
which involves all four DNA strands present
within a synaptic complex, this should result in



Figure 5. In vivo catalytic activities of wild-type and
mutant Int on pNC4. pNC4, which carries inverted attL
and attP, was co-transformed with Int expression vectors
and isolated plasmid DNA processed as described
before, except that BamHI was used as endonuclease
(compare to Table 1). Lanes 1 and 10, kb marker ladder;
lane 2, expression vector alone; lane 3, unrecombined
pNC4; lanes 4 to 6, DNA isolated from cells e



IHF; lanes 7 to 9, same as 1 
absence of IHF. pNC4 demarcates the position of the
three DNA fragments that result from digestion of
unrecombined pNC4 (note that the two smaller fragments
exhibit the same length and, thus, co-migrate);
inv. indicates the position of two of three fragments that
result from inversion (the third fragment exhibits the
same size as the largest fragment obtained from
unrecombined DNA); del. marks the position of one of
two DNA fragments that result from deletion (the size
of the second fragment does not change as a result of
deletion).



inversion instead of deletion. However, we found
that deletion occurs on all four substrates, but
inversion between directly repeated att sites was
not detectable, in either the presence or absence of
IHF. These experiments also revealed that Int-h/218
is significantly more active than Int-h in catalyzing
deletion on pNC9 and pNC10 in the
absence of IHF (Table 1). Hence, Int-h/218 exhibits
an enhanced ability to execute recombination on
wild-type att sites in the absence of accessory factors
IHF and Xis.
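
The expectation tested with the direct-repeat substrates can be summarized in a short sketch. This is a schematic restatement of the reasoning in the text, not a model of the reaction: the function name predicted_outcome and the labels "conventional" and "alternative" are hypothetical, and "alternative" stands for a complete set of alternative strand exchanges involving all four strands, which the results above do not support for direct repeats.

```python
def predicted_outcome(orientation: str, pathway: str) -> str:
    """Return the topological outcome expected for a pair of att sites.

    orientation: "inverted" or "direct" repeat arrangement on one plasmid.
    pathway: "conventional" strand exchange, or a hypothetical complete
             "alternative" exchange that reverses the genetic polarity.
    """
    if orientation not in ("inverted", "direct"):
        raise ValueError(f"unknown orientation: {orientation}")
    if pathway not in ("conventional", "alternative"):
        raise ValueError(f"unknown pathway: {pathway}")

    # Conventional exchange: inverted repeats invert, direct repeats delete.
    conventional = "inversion" if orientation == "inverted" else "deletion"
    if pathway == "conventional":
        return conventional
    # A complete alternative exchange would give the opposite outcome.
    return "deletion" if conventional == "inversion" else "inversion"


for orientation in ("inverted", "direct"):
    for pathway in ("conventional", "alternative"):
        print(orientation, pathway, predicted_outcome(orientation, pathway))
```

The case ("direct", "alternative") is the prediction that was not borne out: inversion between directly repeated att sites was not detectable.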

Our failure to detect the second hybrid site
could so far be due to the possibility that a functional
synaptic complex cannot be formed because
an unknown topological constraint is imposed on
synapsis with substrates carrying att sites as direct
repeats. In order to exclude this possibility, we
again employed the pair of attL and attP which
Figure 6. (a) DNA sequences of deletion products
obtained with pNC1. The sequences of the core and overlap
regions from attB and attP are shown at the left side.
Both att sites are aligned in parallel, as indicated by
black arrows. The capital letters B, B', P, and P' mark
the corresponding core site sequences. The two overlap
sequences are boxed and marked as O. The open arrow
heads point to the positions of top and bottom strand
cleavage by Int. Depicted at the top right are the
sequences and genetic polarities of the two products,
attL and attR, that result from conventional integrative
recombination. Shown at the bottom right are the DNA
sequences and genetic polarities of the two hybrid att
sites, termed attA1 and attA2, which are present on deletion
products. Note that in attA1 the overlap sequence
is provided from the P arm, while that present in attA2
comes from the B arm. (b) DNA sequences of deletion
products obtained with pNC4. The sequences of attL
and attP are depicted at the left side. Both sites are
aligned again in parallel. Symbols are as described in
(a). The two products of conventional strand exchange
are shown at the top right. Note that the composition of
sites with respect to both the core sites and their flanking
regions does not change during conventional strand
exchange. Depicted at the bottom right are the
sequences of the two hybrid att sites, attA3 and attA4,
which are identical with attA1 and attA2, respectively.



gives the highest yield of deletion products with
pNC4. In this case, however, we placed the
inverted att sites in a different orientation with
respect to both the plasmid origin and the resistance
marker gene (pNC5; Table 1). If deletion
occurs, a second hybrid att site (P'OP') should be
generated that can be propagated by plasmid replication.
The first identified site, BOP, will in this
case be lost because the deleted DNA segment
does not contain a replication origin. While we
were able to detect inversion in the absence of IHF,
products that result from deletion in the presence
of IHF are missing. Instead, we observed that cells
completely degrade pNC5 when either Int-h or
Int-h/218 is present (Table 1). To test whether plasmid
degradation is due to the instability of the expected
recombination product carrying P'OP', we constructed
a plasmid designated pPOP which contains
the equivalent sequence of one of the two
expected recombination products (see Materials
and Methods). We found, however, that pPOP is
stably propagated in the presence of Int-h (data
not shown). Hence, the instability of pNC5 cannot
be traced back to the instability of the expected
recombination product carrying the second hybrid
site. These results, in conjunction with our failure to
detect inversion on substrates pNC9 to pNC12,
might indicate that both mutant Int proteins efficiently
execute strand cleavage without subsequent
ligation to generate the second hybrid att site.



Discussion 

Integrative and excisive recombination performed
by Int normally lead to inversion of DNA
segments when the corresponding pair of target
sites is present as an inverted repeat on the same
DNA molecule. We have demonstrated in this
study that mutant Int, in addition, executes an
alternative reaction which leads to deletion. The
most efficient reaction, resulting in nearly 100%
deletion in the presence of IHF, occurs on pNC4
which carries attL and attP (Figure 5).

One possible model to account for the results is
that the two mutant Int proteins have lost the ability
to distinguish between the core binding sites
present to the left and to the right of the overlap
region. This would allow synapsis between two att
sites in the wrong orientation. Reciprocal top
strand exchange would then lead to mispaired top
strands because of non-complementary bottom
strands present in the resulting Holliday junction.
Despite these heterologies, catalytic events may
proceed normally due to the presence of mutant
Int protomers with an enhanced affinity for core
sites, and the Holliday junction may eventually be
resolved through reciprocal bottom strand
exchanges. The resulting heteroduplex structures at
the overlap region of both hybrid att sites could
then be resolved through repair and/or plasmid
replication.

While this presumably represents the most
simple scenario leading to deletion, we think it is
unlikely for two reasons. First, the model predicts
that two hybrid att sites should be generated due
to reciprocal top and bottom strand exchanges.
This should lead to inversion on substrates pNC9
to pNC12, and to deletion on pNC4 and pNC5.
Our results show, however, that inversion is not
detectable, that deletion occurs only on pNC4 and
not on pNC5, and that pNC5 is lost despite selection
for the antibiotic resistance on this plasmid
(Table 1). Second, the model predicts that pre-existing
heterologies between overlap sequences of a
pair of att sites should not be important because
the strands will eventually be mispaired in a Holliday
junction after the first strand exchange is completed.
Hence, deletion on pNC1 and pNC6, the
latter of which carries attP*, should occur under the same
conditions, i.e. in the presence or absence of IHF,
and presumably with comparable efficiencies. The
results show, however, that deletion on pNC1
occurs only in the absence of IHF while deletion
on pNC6 is observed only in the presence of IHF
and, in addition, occurs with an enhanced efficiency
(Figures 3 and 4). Control experiments show
that recombination between attP* and a variant of
attB carrying the same nucleotide exchange within
the overlap region proceeds normally in the presence
of IHF (data not shown). It is unlikely, therefore,
that the different IHF requirements for
deletion on pNC1 and pNC6 are due to a sequence
effect imposed by the overlap in attP*.

In the following, we propose a second possible
pathway leading to deletion. We will focus our discussion
primarily on deletion observed with pNC4
because this particular substrate has the advantage
that the composition of attL and attP does not
change during the course of conventional strand
exchange (Figure 6(b)). It is reasonable to assume
that in the first step, nucleoprotein structures separately
assemble on attL and attP in the presence of
IHF (Segall & Nash, 1993). After site synapsis in
the correct orientation, the first strand exchange is
completed (Figure 7(a) and (b)). This results in a
Holliday junction, which can be resolved normally
by reciprocal "bottom" strand exchanges (compare
to Figure 1). However, we think that during a subsequent
isomerization step, which is required to
switch from top to bottom strand exchange
(Nunes-Düby et al., 1995; Azaro & Landy, 1997),
Int-h and Int-h/218 accidentally cleave all four
strands either sequentially or simultaneously.
This will lead to two double-stranded breaks
(Figure 7(c)). Based on DNA sequence information
obtained from deletion products, we conclude that
the 5'-OH group from the overlap strand extending
the P arm (labelled p) engages in a nucleophilic
attack on the 3'-phosphotyrosine linkage between
Int-h and the B arm, hence replacing the recombinase.
Likewise, the overlap strand still linked to
the B arm (labelled b) attacks the Int-DNA linkage
at the P arm (Figure 7(c) and (d)). Whether ligation
of both strands occurs on the same DNA molecule
in vivo is uncertain. It is possible that on individual
substrate molecules, only one of these strands is
ligated. If so, a subsequent repair involving gap
filling has to occur on the second strand by E. coli
proteins. However, this scenario would lead to the
potential problem that one Int-h monomer remains
covalently linked in cis to the corresponding core




Figure 7. Schematic representation of one possible
reaction pathway leading to deletion on pNC4. (a) attL
and attP are depicted in antiparallel orientation within a
hypothetical synaptic complex. The Int monomer (filled
oval) bound to either the B or P arm initiates a nucleophilic
attack on the top strand within each att site (indicated
by curved arrows). (b) The first reciprocal pair of
strand exchanges between top strands has been completed,
leading to a Holliday junction intermediate structure.
(c) All four Int-h (Int-h/218) monomers engage in
a nucleophilic attack against their corresponding core
sites and become covalently connected in cis through a
3'-phosphotyrosine linkage. This will lead to a double-stranded
break within each att site. (c), (d) The 5'-OH
from the overlap strand connected to the B arm in attL
(labelled b) is rejoined with the P arm from attP. Likewise,
the 5'-OH from the overlap strand which is still
connected with the P arm in attP (labelled p) is ligated
with the B arm in attL. This leads to the formation of a
new hybrid att site with correct chemical polarity, but
with reversed genetic polarity (i.e. BOP). Note that if
both overlap strands, p and b, are religated within the
same DNA molecule, the seven base-pair overlap region
will adopt a heteroduplex structure containing five out
of seven non-complementary bases. Based on the results
presented in this study, the corresponding second
hybrid att site (P'OP') cannot be formed through religation.
It is inferred that these DNAs which still carry Int
monomers covalently linked at the 3' ends will be
degraded by E. coli proteins.



site and has to be removed prior to ligation. At the
moment, we favor the possibility that both strands
can be ligated on a single substrate molecule. This
would lead to the formation of a seven base-pair
heteroduplex structure in the overlap region of the
new hybrid att site (Figure 7(d)), which could be
resolved through repair and/or plasmid replication.
We have shown that the corresponding second
hybrid att site cannot be generated by the mutant
Int. Based on our observation that the substrate
DNA is lost with pNC5 in the presence of IHF, we
think it is likely that E. coli enzymes degrade the
linearized, origin-bearing DNA that contains Int-h
monomers covalently linked to the 3' ends
(Figure 7(d)). This implies that double-stranded
cleavage is efficient and eventually occurs on the
entire population of pNC5 molecules inside E. coli.
This, in turn, is in agreement with our observation
of nearly 100% deletion on pNC4. Deletion on
pNC2, on the other hand, is much less efficient.
This can explain why pNC3 is stably propagated
(Table 1). The fact that we were unable to identify
the second hybrid att site suggests furthermore
that strand ligation occurs within the framework of
a synaptic complex, and not by random collision of
freed DNA ends.

The role(s) of the accessory protein IHF in changing
the directionality of strand exchange is puzzling.
IHF is strictly required for deletion on pNC4
(Figure 5 and Table 1). One possibility is that
additional Int-h (Int-h/218) monomers are delivered
to the core sites of attL from its flanking arm
binding sites via IHF-induced DNA-bending
(Moitoso de Vargas et al., 1988, 1989). Perhaps the presence
of additional mutant Int monomers with an
enhanced affinity for core sites interferes with the
co-ordination of strand cleavage during isomerization
of Holliday junctions. In contrast, deletion on
pNC1 is only observed in the absence of IHF
(Figure 3). IHF and Xis seem to direct Holliday
junction resolution towards recombinant products
in integrative and excisive recombination (Franz &
Landy, 1995). It is possible that without these
accessory proteins, the isomerization step required
to switch to bottom strand exchange is impaired,
so that the amount of this intermediate structure
transiently increases. This, in turn, could result in a
"cleavable synaptic complex" (Figure 7(c)) due to
the presence of mutant instead of wild-type Int.
A similar reasoning could account for deletion
observed with substrates carrying inverted pairs of
wild-type attL and attR (pNC2; Table 1). The situation
is different again with pNC6 and pNC7,
which require IHF for deletion (Figure 4 and
Table 1). Since the overlap sequence in these pairs
of att sites differs at position 0, it is likely that IHF
is required to overcome an impairment in the
isomerization step imposed by such a heterology.
This again could lead to a transient increase in the
amount of Holliday junctions, which are either
resolved through reciprocal bottom strand
exchange or cleaved and rejoined by mutant Int to
yield deletion products.

The formation of so-called "contrary" recombinant
products by wild-type Int has been observed
before in in vitro studies using either heteroduplex
or half-att site substrates (Nash & Robertson, 1989;
Nunes-Düby et al., 1989, 1997). These products,
termed Y-structures, contain one recombinant
strand that results from conventional top strand
exchange, while the second strand shows normal
chemical polarity but with reversed genetic
polarity (e.g. BOP). Complete new hybrid att sites
with reversed genetic polarity in both strands, as
shown in the present study, were not observed.
These Y-structures most likely result as a direct
consequence of the aberrant structures of att sites.
However, an important finding of these studies is
that Int can join strands indiscriminately in the
absence of complementary (homologous) strands.

The present study shows that true contrary
recombinant products are generated in vivo with
wild-type substrates for integrative and excisive
recombination in the absence of IHF. It is important
to note that these products are observed only
with mutant Int. It is therefore an intrinsic property
of Int-h and Int-h/218, and not that of a particular
recombination substrate, that leads to the observed
change in directionality of recombination. It is
possible that the presence of a lysine residue
instead of a glutamate residue at position 174
within the catalytic domain of Int somehow interferes
with the normal communication between Int
monomers within a Holliday junction. This could
lead to an unco-ordinated nucleophilic attack on
all four strands. Alternatively, or in addition, the
presence of this lysine residue may interfere with
the normal isomerization step of Holliday junctions,
leading to a conformational change within
this intermediate DNA structure that allows four
Int monomers to attack the DNA backbones. An
inspection of the recently solved structures of the
Cre-lox recombination synapse and Holliday junction
may be informative here (Gopaul et al., 1998;
Guo et al., 1997). In the recombination synapse, alanine
131 and lysine 132 from the Cre subunit that
has cleaved the loxA site are in contact with DNA.
A sequence comparison of 105 members of the Int
family shows that lysine 174 in Int-h aligns with
alanine 131 in Cre (Nunes-Düby et al., 1998). This
is consistent with our hypothesis that Int-h (Int-h/218)
interferes with the isomerization of Holliday
junctions required to switch from top to bottom
strand exchange, possibly through additional DNA
contact(s) within this intermediate structure.



Materials and Methods 

Bacterial strains 

The following E. coli strains were employed in this
study: DH5α (supE44 ΔlacU169 (φ80lacZΔM15) hsdR17
recA1 endA1 gyrA96 thi-1 relA1; Hanahan, 1983); DH1 (F-
supE44 recA1 endA1 gyrA96 thi-1 hsdR17 relA1 λ-)
(Hanahan, 1983); JC5547 (thr-1 ara-14 leuB6 Δ(gpt-proA)62
lacY1 tsx-33 glnV44(AS) galK2(Oc) λ- hisG4(Oc)
recA13 recC22 recB21 rpsL31(strR) xylA5 mtl-1 argE3(Oc)
thi-1; Willetts & Clark, 1969); CSH26 (F- ara Δ(lac pro) thi;
Miller, 1972); CSH26ΔIHF (F- ara Δ(lac pro) thi
himAΔ82::Tn10(tetR) himDΔ3::cat(camR); kindly provided
by B. Rak, Freiburg, Germany); CSH50 (F- ara Δ(lac pro)
thi strA; Miller, 1972); and CSH50ΔFis (F- ara Δ(lac pro)
thi strA fis::kan; Koch et al., 1988).

Construction of integrase expression vectors

Int and Int-h were subcloned by PCR from pHN1 and
pHN16 (Honigman et al., 1979; Lange-Gustafson &
Nash, 1984), respectively. Both genes were introduced
into the polylinker from pTrc99A (Pharmacia) in which
the NcoI site has been destroyed. Expression from
pTrc99A is under the control of the strong trc promoter
containing the trp (-35) and the lac UV5 (-10) region
separated by 17 bp. Expression is regulated by the
lacIq gene product also encoded on the same
plasmid. Both genes were amplified using the following
primers: "IntproN1" which binds at the 5' end
(5'-GCTCTAGAATGGGAAGAAGGCGAAGTCA-3') and
"IntproCO" binding at the 3' end
(5'-AAAACTGCAGTCATTATTTGATTTCAATTTTGTCCC-3'). PCR products
were resolved through agarose gel electrophoresis,
isolated, and digested with XbaI and PstI. The fragments
were cloned into pTrc99A which was linearized by
XbaI/PstI, and the resulting expression vectors (pTrcInt
and pTrcInt-h) amplified in DH5α.
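
The cloning step above depends on each primer carrying the recognition sequence of the enzyme later used to cut the PCR product. The following is a minimal sketch of that check, not part of the published protocol. It assumes the primers read 5'-GCTCTAGAATGGGAAGAAGGCGAAGTCA-3' and 5'-AAAACTGCAGTCATTATTTGATTTCAATTTTGTCCC-3', and it uses the standard recognition sequences of XbaI (TCTAGA) and PstI (CTGCAG). The names SITES, PRIMERS and find_sites are illustrative only.

```python
# Recognition sequences of the two enzymes used to digest the PCR products.
SITES = {"XbaI": "TCTAGA", "PstI": "CTGCAG"}

# Primer sequences as read from the text (an assumed transcription).
PRIMERS = {
    "IntproN1": "GCTCTAGAATGGGAAGAAGGCGAAGTCA",
    "IntproCO": "AAAACTGCAGTCATTATTTGATTTCAATTTTGTCCC",
}


def find_sites(seq):
    """Return {enzyme: 0-based start index} for every site present in seq."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

Under these assumptions the 5' primer carries only the XbaI site and the 3' primer only the PstI site, which is what directional cloning into XbaI/PstI-linearized pTrc99A requires.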

The double mutant Int-h/218 was generated from
pTrcInt-h by PCR-directed mutagenesis, which substituted
the guanine at position 652 of the Int-h gene by an
adenine residue. The resulting codon change at position
218 replaces the glutamate residue with a lysine residue.
The following oligonucleotides were used as primers:
ncla218KR containing the altered nucleotide anneals at
the coding strand of the Int-h gene
(5'-GGTGATTTATGCAAAATGAAGTGGTCTGATATCGTAGATGGA-3')
and nclb218KL anneals at the non-coding strand
and directs DNA synthesis in the opposite direction
(5'-AACTCGTTGCCCGGTAACAACAGCCAGTTCCATTGGAAG-3').
PCR was performed using the "Master
Mix Kit" (Qiagen, Germany). The resulting linear
expression vector was purified through agarose gel
electrophoresis, isolated, and ligated after phosphorylation
using T4 Ligase (New England Biolabs). Screening for a
functional double mutant was performed through in vivo
recombination assays (see below). The sequence of the
subcloned Int, Int-h, and Int-h/218 gene was confirmed
by DNA sequencing.
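
The relationship between nucleotide position 652 and codon 218 can be checked with simple arithmetic. The sketch below is illustrative and assumes 1-based numbering from the first base of the start codon. The helper names are hypothetical, and only the four codons needed for this check are taken from the standard genetic code.

```python
# Subset of the standard genetic code needed for this check only.
CODON_TO_AA = {"GAA": "Glu", "GAG": "Glu", "AAA": "Lys", "AAG": "Lys"}


def codon_of(nt_position):
    """Map a 1-based nucleotide position to (codon number, 0-based offset in codon)."""
    return (nt_position - 1) // 3 + 1, (nt_position - 1) % 3


def mutate_first_base(codon, new_base):
    """Return the codon with its first base replaced by new_base."""
    return new_base + codon[1:]


if __name__ == "__main__":
    number, offset = codon_of(652)
    print(number, offset)  # codon number and offset within that codon
    for glu in ("GAA", "GAG"):
        lys = mutate_first_base(glu, "A")
        print(glu, CODON_TO_AA[glu], "->", lys, CODON_TO_AA[lys])
```

Position 652 is the first base of codon 218, and a G to A change at the first position turns either glutamate codon into a lysine codon, consistent with the substitution described above.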



Construction of recombination substrates 

Plasmids used as recombination substrates are listed
in Table 1 and are derivatives of pACYC177 carrying the
kanamycin resistance gene. pNC1 (9.13 kb) bearing wild-type
attP and attB as inverted repeats was obtained by
combining pAB3YC, a derivative of pAB3 (Droge &
Cozzarelli, 1989), with pACYC177. pAB3YC was partially
digested with NheI in order to remove a 1.4 kb
origin-containing fragment, and ligated with pACYC177
which was linearized at its unique NheI site. pNC2
(9.13 kb) was generated by in vivo recombination of
pNC1 using wild-type Int in the presence of IHF (see
below). pNC3 (9.13 kb) is a derivative of pNC2 in which
the orientation of attL and attR with respect to both the
origin and resistance gene has been changed. pNC4 (8.2
kb) was constructed by inserting a XmnI-EcoRV attP-containing
fragment from pNC1 into ScaI-linearized
pACYC177. attL was obtained from pNC2 and subsequently
inserted by blunt-end ligation into the [illegible] site
of the attP-bearing plasmid. pNC5 (8.59 kb) was derived
from pNC2 by cloning attP, which is present on a NheI
fragment, into a different position of attR-deleted pNC2.
pNC6 (7.04 kb) was derived from pNC1 by replacing
attP with attP*. The modified attP site was generated by
PCR-directed mutagenesis and first cloned into pTZ18R
(Pharmacia). pNC7 (6.67 kb) is a derivative of pNC2 in
which attR has been replaced by attR*; the latter was
obtained from in vivo recombination between a derivative
of attB, termed attH, and attP*. pNC8 (5.9 kb) is a
derivative of pNC1, in which attP is replaced by attR*.
pNC9 (9.13 kb) is a derivative of pNC1, in which the
orientation of attP has been inverted. pNC10 (9.13 kb) is
a derivative of pNC2 in which attR has been inverted
with respect to attL, so that both att sites are present as
direct repeats. pNC11 (8.20 kb) was constructed as
described for pNC4, but in this case selecting for the
presence of att sites as direct repeats. pNC12 (7.04 kb) is a
derivative of pNC1 in which attP was replaced by attP*
and screened for the desired orientation. pPOP was
generated by cloning two PCR-derived fragments into
BamHI/HindIII-cleaved pACYC184. For this, we used
the unique DdeI restriction site present in attP. The
sequence of the Int core binding sites and the overlap in
POP is as follows: 5'-CAACTTAGTATAAATAAGTTGGC-3'.
It therefore represents one of two possible
recombination products that were expected if
deletion occurs on pNC5.



In vivo recombination 

Expression vectors and recombination substrates were
co-transformed into the appropriate E. coli strains
mentioned in the text. After incubation of single colonies
overnight in the presence of ampicillin to select for the
expression vector and kanamycin to select for the
substrate DNA, transformants were cultivated at 37 °C for
an additional 17 hours under selection pressure, and
plasmid DNA isolated by affinity chromatography
(Qiagen, Germany). Recombination was analyzed through
restriction digests using the appropriate endonuclease
(Table 1) and subsequent separation of DNA fragments
through agarose gel electrophoresis. In order to test
whether recA, recB or recC may be required for deletion
on pNC4 and pNC6, we employed E. coli strains DH1
and JC5547. The requirement for Fis was tested by
comparing deletion on a derivative of pNC6, which carries a
spectinomycin resistance gene, in CSH50 and CSH50ΔFis.



DNA sequencing 

Deletion products were sequenced using the
fluorescence-based 373A system (Applied Biosystems). The
following two oligonucleotides were used as sequencing
primers: ATT-PC which anneals at the P arm in
the direction of the overlap region
(5'-TTGATAGCTCTTCCGCTTTCTGTTACAGGTCACTAATACC-3'), and
ATT-BA which anneals at the complementary strand
within the B arm
(5'-GTCTAGCTAGCCGGGAAACTGAAAATGTGTTC-3'). The sequence of subcloned Int
genes and that of Int-h/218 was determined using three
oligonucleotides as primers. Two of them anneal within
pTrc99A either upstream or downstream of the
polylinker. The third anneals at nucleotide positions 331 to
348 within the Int gene.



Gel electrophoresis 

DNA was analyzed through agarose gel electrophoresis
(0.8%) in TBE buffer (90 mM Tris-borate (pH 8.3),
23 mM EDTA). DNA was visualized by UV after staining
with ethidium bromide. Photographs were taken
with the Image Master® System (Pharmacia).

ABSTRACT



Positive-negative selector (PNS) vectors are provided for 
modifying a target DNA sequence contained in the genome
of a target cell capable of homologous recombination. The 
vector comprises a first DNA sequence which contains at 
least one sequence portion which is substantially homolo- 
gous to a portion of a first region of a target DNA sequence. 
The vector also includes a second DNA sequence containing 
at least one sequence portion which is substantially homolo- 
gous to another portion of a second region of a target DNA 
sequence. A third DNA sequence is positioned between the 
first and second DNA sequences and encodes a positive 
selection marker which when expressed is functional in the 
target cell in which the vector is used. A fourth DNA 
sequence encoding a negative selection marker, also func- 
tional in the target cell, is positioned 5' to the first or 3' to the 
second DNA sequence and is substantially incapable of 
homologous recombination with the target DNA sequence. 
The invention also includes transformed cells containing at 
least one predetermined modification of a target DNA 
sequence contained in the genome of the cell. In addition, 
the invention includes organisms such as non-human
transgenic animals and plants which contain cells having
predetermined modifications of a target DNA sequence in the
genome of the organism.

44 Claims, 13 Drawing Sheets 

POSITIVE-NEGATIVE SELECTION
METHODS AND VECTORS 

This invention was funded under grant No. R01-GM-21168
issued by the U.S. Department of Health and Human
Services. This is a continuation of application Ser. No.
07/397,707, filed Aug. 22, 1989, now abandoned.

TECHNICAL FIELD OF THE INVENTION 

The invention relates to cells and non-human organisms 
containing predetermined genomic modifications of the 
genetic material contained in such cells and organisms. The 
invention also relates to methods and vectors for making 
such modifications.

BACKGROUND OF THE INVENTION 

Many unicellular and multicellular organisms have been
made containing genetic material which is not otherwise
normally found in the cell or organism. For example,
bacteria, such as E. coli, have been transformed with plasmids
which encode heterologous polypeptides, i.e., polypeptides
not normally associated with that bacterium. Such
transformed cells are routinely used to express the heterologous
gene to obtain the heterologous polypeptide. Yeasts,
filamentous fungi and animal cells have also been transformed
with genes encoding heterologous polypeptides. In the case
of bacteria, heterologous genes are readily maintained by
way of an extrachromosomal element such as a plasmid.
More complex cells and organisms such as filamentous
fungi, yeast and mammalian cells typically maintain the
heterologous DNA by way of integration of the foreign DNA
into the genome of the cell or organism. In the case of
mammalian cells and most multicellular organisms such
integration is most frequently random within the genome.

Transgenic animals containing heterologous genes have
also been made. For example, U.S. Pat. No. 4,736,866
discloses transgenic non-human mammals containing
activated oncogenes. Other reports for producing transgenic
animals include PCT Publication No. WO82/04443 (rabbit
β-globin gene DNA fragment injected into the pronucleus of
a mouse zygote); EPO Publication No. 0 264 166 (Hepatitis
B surface antigen and tPA genes under control of the whey
acid protein promoter for mammary tissue specific
expression); EPO Publication No. 0 247 494 (transgenic mice
containing heterologous genes encoding various forms of
insulin); PCT Publication No. WO88/00239 (tissue specific
expression of a transgene encoding factor IX under control
of a whey protein promoter); PCT Publication No.
WO88/01648 (transgenic mammal having mammary secretory cells
incorporating a recombinant expression system comprising
a mammary lactogen-inducible regulatory region and a
structural region encoding a heterologous protein); and EPO
Publication No. 0 279 582 (tissue specific expression of
chloramphenicol acetyltransferase under control of rat
β-casein promoter in transgenic mice). The methods and
DNA constructs ("transgenes") used in making these
transgenic animals also result in the random integration of all or
part of the transgene into the genome of the organism.
Typically, such integration occurs in an early embryonic
stage of development which results in a mosaic transgenic
animal. Subsequent generations can be obtained, however,
wherein the randomly inserted transgene is contained in all
of the somatic cells of the transgenic animals.

Transgenic plants have also been produced. For example,
U.S. Pat. No. 4,801,540 to Hiatt, et al., discloses the
transformation of plant cells with a plant expression vector
containing tomato polygalacturonase (PG) oriented in the
opposite orientation for expression. The anti-sense RNA
expressed from this gene is capable of hybridizing with
endogenous PG mRNA to suppress translation. This inhibits
production of PG and as a consequence the hydrolysis of
pectin by PG in the tomato.

While the integration of heterologous DNA into cells and 
organisms is potentially useful to produce transformed cells 
and organisms which are capable of expressing desired 
genes and/or polypeptides, many problems are associated 
with such systems. A major problem resides in the random 
pattern of integration of the heterologous gene into the 
genome of cells derived from multicellular organisms such 
as mammalian cells. This often results in a wide variation in 
the level of expression of such heterologous genes among 
different transformed cells. Further, random integration of 
heterologous DNA into the genome may disrupt endogenous 
genes which are necessary for the maturation, differentiation 
and/or viability of the cells or organism. In the case of 
transgenic animals, gross abnormalities are often caused by 
random integration of the transgene and gross rearrange- 
ments of the transgene and/or endogenous DNA often occur 
at the insertion site. For example, a common problem
associated with transgenes designed for tissue-specific
expression involves the "leakage" of expression of the
transgenes. Thus, transgenes designed for the expression and
secretion of a heterologous polypeptide in mammary
secretory cells may also be expressed in brain tissue thereby
producing adverse effects in the transgenic animal. While 
the reasons for transgene "leakage" and gross rearrange- 
ments of heterologous and endogenous DNA are not known 
with certainty, random integration is a potential cause of 
expression leakage. 

One approach to overcome problems associated with 
random integration involves the use of gene targeting. This
method involves the selection for homologous recombina- 
tion events between DNA sequences residing in the genome 
of a cell or organism and newly introduced DNA sequences. 
This provides means for systematically altering the genome 
of the cell or organism. 

For example, Hinnen, J. B., et al. (1978) Proc. Natl. Acad.
Sci. U.S.A., 75, 1929-1933 report homologous recombination
between a leu2⁺ plasmid and a leu2⁻ gene in the yeast
genome. Successful homologous transformants were
positively selected by growth on media deficient in leucine.

For mammalian systems, several laboratories have 
reported the insertion of exogenous DNA sequences into 
specific sites within the mammalian genome by way of 
homologous recombination. For example, Smithies, O., et
al. (1985) Nature, 317, 230-234 report the insertion of a
linearized plasmid into the genome of cultured mammalian
cells near the β-globin gene by homologous recombination.
The modified locus so obtained contained inserted vector
sequences containing a neomycin resistance gene and a
supF gene encoding an amber suppressor tRNA positioned
between the δ and β-globin structural genes. The homologous
insertion of this vector also resulted in the duplication
of some of the DNA sequence between the δ and β-globin
genes and part of the β-globin gene itself. Successful
transformants were selected using a neomycin related antibiotic.
Since most transformation events randomly inserted this
plasmid, insertion of this plasmid by homologous
recombination did not confer a selectable, cellular phenotype for
homologous recombination mediated transformation. A
laborious screening test for identifying predicted targeting
events using plasmid rescue of the supF marker in a phage
library prepared from pools of transfected colonies was
used. Sib selection utilizing this assay identified the
transformed cells in which homologous recombination had
occurred.

A significant problem encountered in detecting and
isolating cells, such as mammalian and plant cells, wherein
homologous recombination events have occurred lies in the
greater propensity for such cells to mediate non-homologous
recombination. See Roth, D. B., et al. (1985) Proc. Natl.
Acad. Sci. U.S.A., 82, 3355-3359; Roth, D. B., et al. (1985),
Mol. Cell. Biol., 5, 2599-2607; and Paszkowski, J., et al.
(1988), EMBO J., 7, 4021-4026. In order to identify
homologous recombination events among the vast pool of
random insertions generated by non-homologous
recombination, early gene targeting experiments in mammalian cells
were designed using cell lines carrying a mutated form of
either a neomycin resistance (neoʳ) gene or a herpes simplex
virus thymidine kinase (HSV-tk) gene, integrated randomly
into the host genome. Such exogenous defective genes were
then specifically repaired by homologous recombination
with newly introduced exogenous DNA carrying the same
gene bearing a different mutation. Productive gene targeting
events were identified by selection for cells with the wild
type phenotype, either by resistance to the drug G418 (neoʳ)
or ability to grow in HAT medium (tk⁺). See, e.g., Folger, K.
R., et al. (1984), Cold Spring Harbor Symp. Quant. Biol., 49,
123-138; Lin, F. L. et al. (1984), Cold Spring Harbor Symp.
Quant. Biol., 49, 139-149; Smithies, O., et al. (1984), Cold
Spring Harbor Symp. Quant. Biol., 49, 161-170; Smith, A.
J. H., et al. (1984), Cold Spring Harbor Symp. Quant. Biol.,
49, 171-181; Thomas K. R., et al. (1986), Cell, 44, 419-428;
Thomas, K. R., et al. (1986), Nature, 324, 34-38;
Doetschman, T., et al. (1987), Nature, 330, 576-578; and Song,
Kuy-Young, et al. (1987), Proc. Natl. Acad. Sci. U.S.A., 84,
6820-6824. A similar approach has been used in plant cells
where partially deleted neomycin resistance genes
reportedly were randomly inserted into the genome of tobacco
plants. Transformation with vectors containing the deleted
sequences conferred resistance to neomycin in those plant
cells wherein homologous recombination occurred.
Paszkowski, J., et al. (1988), EMBO J., 7, 4021-4026.

A specific requirement and significant limitation to this 
approach is the necessity that the targeted gene confer a 
positive selection characteristic in those cells wherein 
homologous recombination has occurred. In each of the 
above cases, a defective exogenous positive selection 
marker was inserted into the genome. Such a requirement 
severely limits the utility of such systems to the detection of 
homologous recombination events involving inserted select- 
able genes. 

In a related approach, Thomas, K. R., et al. (1987), Cell, 
51, 503-512, report the disruption of a selectable endog- 
enous mouse gene by homologous recombination. In this 
approach, a vector was constructed containing a neomycin 
resistance gene inserted into sequences encoding an exon of
the mouse hypoxanthine phosphoribosyl transferase (Hprt)
gene. This endogenous gene was selected for two reasons.
First, the Hprt gene lies on the X-chromosome. Since
embryonic stem cells (ES cells) derived from male embryos
are hemizygous for Hprt, only a single copy of the Hprt gene
need be inactivated by homologous recombination to
produce a selectable phenotype. Second, selection procedures
are available for isolating Hprt⁻ mutants. Cells wherein
homologous recombination events occurred could thereafter
be positively selected by detecting cells resistant to
neomycin (neoʳ) and 6-thioguanine (Hprt⁻).

A major limitation in the above methods has been the
requirement that the target sequence in the genome, either
endogenous or exogenous, confer a selection characteristic
to the cells in which homologous recombination has
occurred (i.e. neoʳ, tk⁺ or Hprt⁻). Further, for those gene
sequences which confer a selectable phenotype upon
homologous recombination (e.g. the Hprt gene), the
formation of such a selectable phenotype requires the disruption of
the endogenous gene.

The foregoing approaches to gene targeting are clearly not 
applicable to many emerging technologies. See, e.g.
Friedman, T. (1989), Science, 244, 1275-1281 (human gene
therapy); Gasser, C. S., et al., Id., 1293-1299 (genetic
engineering of plants); Pursel, I. G., et al., Id., 1281-1288
(genetic engineering of livestock); and Timberlake, W. E., et
al., Id., 1313-1317 (genetic engineering of filamentous
fungi). Such techniques are generally not useful to
isolate transformants wherein non-selectable endogenous 
genes are disrupted or modified by homologous
recombination. The above methods are also of little or no use for gene
therapy because of the difficulty in selecting cells wherein
the genetic defect has been corrected by way of homologous 
recombination. 

Recently, several laboratories have reported the
expression of an expression-defective exogenous selection marker
after homologous integration into the genome of mammalian
cells. Sedivy, J. M., et al. (1989), Proc. Natl. Acad. Sci.
U.S.A., 86, 227-231, report targeted disruption of the
hemizygous polyomavirus middle-T antigen with a neomycin
resistance gene lacking an initiation codon. Successful
transformants were selected for resistance to G418. Jasin, M., et
al. (1988), Genes and Development, 2, 1353-1363 report
integration of an expression-defective gpt gene lacking the
enhancer in its SV40 early promoter into the SV40 early
region of a gene already integrated into the mammalian
genome. Upon homologous recombination, the defective gpt
gene acts as a selectable marker.

Assays for detecting homologous recombination have 
also recently been reported by several laboratories. Kim, H.
S., et al. (1988), Nucl. Acids Res., 16, 8887-8903, report
the use of the polymerase chain reaction (PCR) to identify
the disruption of the mouse hprt gene. A similar strategy has
been used by others to identify the disruption of the Hox 1.1
gene in mouse ES cells (Zimmer, A. P., et al. (1989),
Nature, 338, 150-153) and the disruption of the En-2 gene
by homologous recombination in embryonic stem cells
(Joyner, A. L., et al. (1989), Nature, 338, 153-156).

It is an object herein to provide methods whereby any 
predetermined region of the genome of a cell or organism 
may be modified and wherein such modified cells can be 
selected and enriched. 

It is a further object of the invention to provide novel 
vectors used in practicing the above methods of the inven- 
tion. 

Still further, an object of the invention is to provide
transformed cells which have been modified by the methods 
and vectors of the invention to contain desired mutations in 
specific regions of the genome of the cell. 

Further, it is an object herein to provide non-human 
transgenic organisms, which contain cells having predeter- 
mined genomic modifications. 

The references discussed above are provided solely for
their disclosure prior to the filing date of the present appli- 
cation. Nothing herein is to be construed as an admission 
that the inventors are not entitled to antedate such disclosure 
by virtue of prior invention. 

SUMMARY OF THE INVENTION

In accordance with the above objects, positive-negative 
selector (PNS) vectors are provided for modifying a target 
DNA sequence contained in the genome of a target cell 
capable of homologous recombination. The vector com- 
prises a first DNA sequence which contains at least one 
sequence portion which is substantially homologous to a 
portion of a first region of a target DNA sequence. The 
vector also includes a second DNA sequence containing at 
least one sequence portion which is substantially homolo- 
gous to another portion of a second region of a target DNA 
sequence. A third DNA sequence is positioned between the 
first and second DNA sequences and encodes a positive 
selection marker which when expressed is functional in the
target cell in which the vector is used. A fourth DNA 
sequence encoding a negative selection marker, also func- 
tional in the target cell, is positioned 5' to the first or 3' to the 
second DNA sequence and is substantially incapable of 
homologous recombination with the target DNA sequence. 

The above PNS vector containing two homologous
portions and a positive and a negative selection marker can be
used in the methods of the invention to modify target DNA
sequences. In this method, cells are first transfected with the
above vector. During this transformation, the PNS vector is
most frequently randomly integrated into the genome of the
cell. In this case, substantially all of the PNS vector
containing the first, second, third and fourth DNA sequences is
inserted into the genome. However, some of the PNS vector
is integrated into the genome via homologous
recombination. When homologous recombination occurs between the
homologous portions of the first and second DNA sequences
of the PNS vector and the corresponding homologous
portions of the endogenous target DNA of the cell, the fourth
DNA sequence containing the negative selection marker is
not incorporated into the genome. This is because the
negative selection marker lies outside of the regions of
homology in the endogenous target DNA sequence. As a
consequence, at least two cell populations are formed. That
cell population wherein random integration of the vector has
occurred can be selected against by way of the negative
selection marker contained in the fourth DNA sequence.
This is because random events occur by integration at the
ends of linear DNA. The other cell population wherein gene
targeting has occurred by homologous recombination is
positively selected by way of the positive selection marker
contained in the third DNA sequence of the vector. This cell
population does not contain the negative selection marker
and thus survives the negative selection. The net effect of
this positive-negative selection method is to substantially
enrich for transformed cells containing a modified target
DNA sequence.
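
The selection logic described above can be sketched as a small truth table. This is a simplified illustration and not a model taken from the specification: it assumes each cell either does or does not carry a functional copy of each marker, and the names Cell and survives_pns are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    label: str
    has_positive_marker: bool  # third DNA sequence integrated
    has_negative_marker: bool  # fourth DNA sequence integrated


def survives_pns(cell):
    """Survive positive selection (marker present) and negative selection (marker absent)."""
    return cell.has_positive_marker and not cell.has_negative_marker


# Three idealized outcomes of transfection with a PNS vector.
population = [
    Cell("no integration", False, False),
    Cell("random integration", True, True),
    Cell("homologous recombination", True, False),
]

survivors = [cell.label for cell in population if survives_pns(cell)]

if __name__ == "__main__":
    print(survivors)
```

In this idealized picture only the homologous recombinants pass both selections. The text claims substantial enrichment rather than absolute selection, so real populations would be expected to contain some exceptions.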

If in the above PNS vector, the third DNA sequence
containing the positive selection marker is positioned
between first and second DNA sequences corresponding to
DNA sequences encoding a portion of a polypeptide (e.g.
within the exon of a eucaryotic organism) or within a
regulatory region necessary for gene expression,
homologous recombination allows for the selection of cells wherein
the gene containing such target DNA sequences is modified
such that it is non-functional.

If, however, the positive selection marker contained in the
third DNA sequence of the PNS vector is positioned within
an untranslated region of the genome, e.g. within an intron
in a eucaryotic gene, modifications of the surrounding target
sequence (e.g. exons and/or regulatory regions) by way of
substitution, insertion and/or deletion of one or more
nucleotides may be made without eliminating the functional
character of the target gene.

The invention also includes transformed cells containing 
at least one predetermined modification of a target DNA 
sequence contained in the genome of the cell. 

In addition, the invention includes organisms such as 
non-human transgenic animals and plants which contain
cells having predetermined modifications of a target DNA 
sequence in the genome of the organism. 

Various other aspects of the invention will be apparent 
from the following detailed description, appended drawings 
and claims. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 depicts the positive-negative selection (PNS) vec- 
tor of the invention and a target DNA sequence. 

FIGS. 2A and 2B depict the results of gene targeting 
(homologous recombination) and random integration of a 
PNS vector into a genome respectively. 

FIG. 3 depicts a PNS vector containing a positive selec- 
tion marker within a sequence corresponding, in part, to an 
intron of a target DNA sequence. 

FIG. 4 is a graphic representation of the absolute fre- 
quency of homologous recombination versus the amount of 
100% sequence homology in the first and second DNA 
sequences of the PNS vectors of the invention. 

FIGS. 5A, 5B, 5C and 5D depict the construction of a
PNS vector used to disrupt the INT-2 gene.

FIG. 6 depicts the construction of a PNS vector used to
disrupt the HOX1.4 gene.

FIGS. 7A, 7B and 7C depict the construction of a PNS
vector used to transform endothelial cells to express factor
VIII.

FIG. 8 depicts a PNS vector to correct a defect in the
purine nucleoside phosphorylase gene.

FIG. 9 depicts a vector for promoterless PNS.

FIG. 10 depicts the construction of a PNS vector to target
an inducible promoter into the int-2 locus.

DETAILED DESCRIPTION OF THE 
INVENTION 

The positive-negative selection ("PNS") methods and 
vectors of the invention are used to modify target DNA
sequences in the genome of cells capable of homologous 
recombination. 

A schematic diagram of a PNS vector of the invention is 
shown in FIG. 1. As can be seen, the PNS vector comprises 
four DNA sequences. The first and second DNA sequences 
each contain portions which are substantially homologous to 
corresponding homologous portions in first and second 
regions of the targeted DNA. Substantial homology is nec- 
essary between these portions in the PNS vector and the 
target DNA to insure targeting of the PNS vector to the 
appropriate region of the genome.

As used herein, a "target DNA sequence" is a predeter- 
mined region vrithin the genome of a cell which is targeted 
for modification by the PNS vectors of the invention. Target 
DNA sequences include structural genes (i.e., DNA 
sequences encoding polypeptides including, in the case of
eucaryotes, introns and exons), regulatory sequences such as
enhancer sequences, promoters and the like, and other
regions within the genome of interest. A target DNA
sequence may also be a sequence which, when targeted by
a vector, has no effect on the function of the host genome.
Generally, the target DNA contains at least first and second 
regions. See FIG. 1. Each region contains a homologous 
sequence portion which is used to design the PNS vector of 
the invention. In some instances, the target DNA sequence
also includes a third and in some cases a third and fourth 
region. The third and fourth regions are substantially con- 
tiguous with the homologous portions of the first and second 
region. The homologous portions of the target DNA are 
homologous to sequence portions contained in the PNS 
vector. The third and in some cases third and fourth regions 
define genomic DNA sequences within the target DNA 
sequence which are not substantially homologous to the
fourth and in some cases fourth and fifth DNA sequences of 
the PNS vector. 

Also included in the PNS vector are third and fourth DNA 
sequences which encode respectively "positive" and "nega- 
tive" selection markers. Examples of preferred positive and 
negative selection markers are listed in Table I. The third
DNA sequence encoding the positive selection marker is 
positioned between the first and second DNA sequences 
while the fourth DNA sequence encoding the negative 
selection marker is positioned either 3' to the second DNA 
sequence as shown in FIG. 1, or 5' to the first DNA sequence
(not shown in FIG. 1). The positive and negative selection 
markers are chosen such that they are functional in the cells
containing the target DNA. 
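
By way of illustration only, the arrangement just described can be
written down as a small data model. The Python sketch below is not
part of the specification: the class name PNSVector, the element
labels and the method is_valid_layout are hypothetical names chosen
for this example. It merely checks that the positive marker lies
between the first and second DNA sequences and that every negative
marker lies outside them.

```python
from dataclasses import dataclass
from typing import List

# Element labels are illustrative names, not terms defined in the specification.
HOMOLOGY_1 = "first_homology"
HOMOLOGY_2 = "second_homology"
POSITIVE = "positive_marker"
NEGATIVE = "negative_marker"


@dataclass
class PNSVector:
    """Ordered 5'-to-3' list of the DNA sequences making up a PNS vector."""
    elements: List[str]

    def is_valid_layout(self) -> bool:
        e = self.elements
        # Each homology sequence and the positive marker must occur exactly once.
        if any(e.count(x) != 1 for x in (HOMOLOGY_1, HOMOLOGY_2, POSITIVE)):
            return False
        h1, h2, pos = e.index(HOMOLOGY_1), e.index(HOMOLOGY_2), e.index(POSITIVE)
        # The positive marker sits between the first and second DNA sequences.
        if not (h1 < pos < h2):
            return False
        # At least one negative marker is required, and each must lie
        # outside the region bounded by the two homology sequences.
        negatives = [i for i, x in enumerate(e) if x == NEGATIVE]
        return bool(negatives) and all(i < h1 or i > h2 for i in negatives)


# Negative marker 3' to the second DNA sequence, as in FIG. 1.
fig1_like = PNSVector([HOMOLOGY_1, POSITIVE, HOMOLOGY_2, NEGATIVE])
# Negative marker 5' to the first DNA sequence (the arrangement not shown in FIG. 1).
five_prime = PNSVector([NEGATIVE, HOMOLOGY_1, POSITIVE, HOMOLOGY_2])
# Invalid: the negative marker lies inside the region of homology.
misplaced = PNSVector([HOMOLOGY_1, NEGATIVE, POSITIVE, HOMOLOGY_2])
```

A vector carrying a negative marker at each end would pass the same
check, since the test only requires that no negative marker fall
between the two homology sequences.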

Positive and/or negative selection markers are "functional"
in transformed cells if the phenotype expressed by
the DNA sequences encoding such selection markers is 
capable of conferring either a positive or negative selection 
characteristic for the cell expressing that DNA sequence. 
Thus, "positive selection" comprises contacting cells transfected
with a PNS vector with an appropriate agent which
kills or otherwise selects against cells not containing an
integrated positive selection marker. "Negative selection" on
the other hand comprises contacting cells transfected with
the PNS vector with an appropriate agent which kills or
otherwise selects against cells containing the negative selection
marker. Appropriate agents for use with specific positive
and negative selection markers and appropriate concentrations
are listed in Table I. Other positive selection markers
include DNA sequences encoding membrane bound 
polypeptides. Such polypeptides are well known to those
skilled in the art and contain a secretory sequence, an
extracellular domain, a transmembrane domain and an intracellular
domain. When expressed as a positive selection
marker, such polypeptides associate with the target cell
membrane. Fluorescently labelled antibodies specific for the
extracellular domain may then be used in a fluorescence
activated cell sorter (FACS) to select for cells expressing the 
membrane bound polypeptide. FACS selection may occur 
before or after negative selection. 

TABLE I

Selectable Markers for Use in PNS-Vectors

Agent           Concentration     Organism
Histidinol      5-500             Animals
Xanthine        50-500            Animals
Bleomycin       1-100             Plants
Hypoxanthine    0.01-10           All
Acyclovir       1-100 µM          Animals
Gancyclovir     0.05-200 µM       Animals
FIAU            0.02-100          Animals
(illegible)     1-100             All
(illegible)     5-500 µg/ml       Plants
(illegible)     10-1000           Eukaryotes

The expression of the negative selection marker in the fourth 
DNA sequence is generally under control of appropriate 
regulatory sequences which render its expression in the 
target cell independent of the expression of other sequences 
in the PNS vector or the target DNA. The positive selection 
marker in the third DNA sequence, however, may be constructed so
that it is independently expressed (e.g. when contained in an
intron of the target DNA) or constructed so that homologous 
recombination will place it under control of regulatory 
sequences in the target DNA sequence. The strategy and 
details of the expression of the positive selection marker will 
be discussed in more detail hereinafter. 

The positioning of the negative selection marker as being 
either "5'" or "3'" is to be understood as relating to the
positioning of the negative selection marker relative to the 5' 
or 3' end of one of the strands of the double-stranded PNS 
vector. This should be apparent from FIG. 1. The positioning 
of the various DNA sequences within the PNS vector, 
however, does not require that each of the four DNA 
sequences be transcriptionally and translationally aligned on 
a single strand of the PNS vector. Thus, for example, the first 
and second DNA sequences may have a 5' to 3' orientation 
consistent with the 5' to 3' orientation of regions 1 and 2 in 
the target DNA sequence. When so aligned, the PNS vector
is a "replacement PNS vector". Upon homologous recombination,
the replacement PNS vector replaces the genomic
DNA sequence between the homologous portions of the 
target DNA with the DNA sequences between the homolo- 
gous portion of the first and second DNA sequences of the 
PNS vector. Sequence replacement vectors are preferred in 
practicing the invention. Alternatively, the homologous por- 
tions of the first and second DNA sequence in the PNS 
vector may be inverted relative to each other such that the 
homologous portion of DNA sequence 1 corresponds 5' to 3' 
with the homologous portion of region 1 of the target DNA 
sequence whereas the homologous portion of DNA sequence 
2 in the PNS vector has an orientation which is 3' to 5' relative to
the homologous portion of the second region of the target
DNA sequence. This inverted orientation
provides for an "insertion PNS vector". When an insertion
PNS vector is homologously inserted into the target DNA 
sequence, the entire PNS vector is inserted into the target 
DNA sequence without replacing the homologous portions 
in the target DNA. The modified target DNA so obtained
necessarily contains the duplication of at least those homolo- 
gous portions of the target DNA which are contained in the 
PNS vector. Sequence replacement vectors and sequence 
insertion vectors utilizing a positive selection marker only 
are described by Thomas et al. (1987), Cell, 51, 503-512.

Similarly, the third and fourth DNA sequences may be 
transcriptionally inverted relative to each other and to the 
transcriptional orientation of the target DNA sequence. This 
is only the case, however, when expression of the positive 
and/or negative selection marker in the third and/or fourth
DNA sequence respectively is independently controlled by
appropriate regulatory sequences. When, for example, a
promoterless positive selection marker is used as a third
DNA sequence such that its expression is to be placed under
control of an endogenous regulatory region, such a vector
requires that the positive selection marker be positioned so 
that it is in proper alignment (5' to 3' and proper reading 
frame) with the transcriptional orientation and sequence of 
the endogenous regulatory region. 

Positive-negative selection requires that the fourth DNA
sequence encoding the negative marker be substantially 
incapable of homologous recombination with the target 
DNA sequence. In particular, the fourth DNA sequence 
should be substantially non-homologous to a third region of 
the target DNA. When the fourth DNA sequence is positioned
3' to the second DNA sequence, the fourth DNA
sequence is non-homologous to a third region of the target 
DNA which is adjacent to the second region of the target 
DNA. See FIG. 1. When the fourth DNA sequence is located 
5' to the first DNA sequence, it is non-homologous to a
fourth region of the target DNA sequence adjacent to the first
region of the target DNA.

In some cases, the PNS vector of the invention may be 
constructed with a fifth DNA sequence also encoding a 
negative selection marker. In such cases, the fifth DNA
sequence is positioned at the opposite end of the PNS vector 
to that containing the fourth DNA sequence. The fourth 
DNA sequence is substantially non-homologous to the third 
region of the target DNA and the fifth DNA sequence is 
substantially non-homologous to the fourth region of the
target DNA. The negative selection markers contained in 
such a PNS vector may either be the same or different 
negative selection markers. When they are different such 
that they require the use of two different agents to select 
against cells containing such negative markers, such negative
selection may be carried out sequentially or simultaneously 
with appropriate agents for the negative selection marker. 
The positioning of two negative selection markers at the 5' 
and 3' end of a PNS vector further enhances selection against
target cells which have randomly integrated the PNS vector.
This is because random integration sometimes results in the 
rearrangement of the PNS vector resulting in excision of all 
or part of the negative selection marker prior to random 
integration. When this occurs, cells randomly integrating the 
PNS vector cannot be selected against. However, the presence
of a second negative selection marker on the PNS
vector substantially enhances the likelihood that random 
integration will result in the insertion of at least one of the 
two negative selection markers. 
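
The benefit of a second negative marker can be made concrete with
a simple probability sketch. The example below assumes, purely for
illustration, that each negative marker is lost or inactivated
independently and with the same probability during random
integration. Neither that independence assumption nor the example
loss rate of 5% is taken from the specification, and the function
name escape_fraction is hypothetical.

```python
def escape_fraction(loss_probability: float, n_markers: int) -> float:
    """Fraction of random integrants retaining no functional negative marker.

    Assumes each marker is lost independently with the same probability,
    which is an illustrative simplification and not a measured property.
    """
    if not 0.0 <= loss_probability <= 1.0:
        raise ValueError("loss_probability must be between 0 and 1")
    if n_markers < 1:
        raise ValueError("n_markers must be at least 1")
    # A random integrant escapes negative selection only if every
    # negative marker on the vector has been lost.
    return loss_probability ** n_markers


# Hypothetical loss rate of 5% per marker.
one_marker = escape_fraction(0.05, 1)
two_markers = escape_fraction(0.05, 2)
```

Under these assumptions the fraction of random integrants escaping
negative selection falls from the per-marker loss rate to its square
when a second marker is added.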

The substantial non-homology between the fourth DNA
sequence (and in some cases fourth and fifth DNA 
sequences) of the PNS vector and the target DNA creates a
discontinuity in sequence homology at or near the juncture
of the fourth DNA sequence. Thus, when the vector is 
integrated into the genome by way of the homologous 
recombination mechanism of the cell, the negative selection 
marker in the fourth DNA sequence is not transferred into
the target DNA. It is the non-integration of this negative 
selection marker during homologous recombination which 
forms the basis of the PNS method of the invention. 

As used herein, a "modifying DNA sequence" is a DNA 
sequence contained in the first, second and/or third DNA 
sequence which encodes the substitution, insertion and/or 
deletion of one or more nucleotides in the target DNA
sequence after homologous insertion of the PNS vector into 
the targeted region of the genome. When the PNS vector 
contains only the insertion of the third DNA sequence 
encoding the positive selection marker, the third DNA 
sequence is sometimes referred to as a "first modifying DNA 
sequence". When in addition to the third DNA sequence, the 
PNS vector also encodes the further substitution, insertion
and/or deletion of one or more nucleotides, that portion 
encoding such further modification is sometimes referred to 
as a "second modifying DNA sequence". The second modi- 
fying DNA sequence may comprise the entire first and/or 
second DNA sequence or in some instances may comprise 
less than the entire first and/or second DNA sequence. The 
latter case typically arises when, for example, a heterologous 
gene is incorporated into a PNS vector which is designed to 
place that heterologous gene under the regulatory control of 
endogenous regulatory sequences. In such a case, the 
homologous portion of, for example, the first DNA sequence 
may comprise all or part of the targeted endogenous regulatory
sequence and the modifying DNA sequence comprises
that portion of the first DNA sequence (and in some
cases a part of the second DNA sequence as well) which 
encodes the heterologous DNA sequence. An appropriate 
homologous portion in the second DNA sequence will be 
included to complete the targeting of the PNS vector. On the 
other hand, the entire first and/or second DNA sequence may 
comprise a second modifying DNA sequence when, for
example, either or both of these DNA sequences encode for 
the correction of a genetic defect in the targeted DNA 
sequence. 

As used herein, "modified target DNA sequence" refers to 
a DNA sequence in the genome of a targeted cell which has 
been modified by a PNS vector. Modified DNA sequences 
contain the substitution, insertion and/or deletion of one or 
more nucleotides in a first transformed target cell as com- 
pared to the cells from which such transformed target cells 
are derived. In some cases, modified target DNA sequences 
are referred to as "first" and/or "second modified target DNA 
sequences". These correspond to the DNA sequence found 
in the transformed target cell when a PNS vector containing 
a first or second modifying sequence is homologously 
integrated into the target DNA sequence.

"Transformed target cells" sometimes referred to as "first
transformed target cells" refers to those target cells wherein 
the PNS vector has been homologously integrated into the 
target cell genome. A "transformed cell" on the other hand
refers to a cell wherein the PNS vector has non-homologously
inserted into the genome randomly. "Transformed target
cells" generally contain a positive selection marker within 
the modified target DNA sequence. When the object of the 
genomic modification is to disrupt the expression of a 
particular gene, the positive selection marker is generally 
contained within an exon which effectively disrupts tran- 
scription and/or translation of the targeted endogenous gene. 
When, however, the object of the genomic modification is to
insert an exogenous gene or correct an endogenous gene
defect, the modified target DNA sequence in the first transformed
target cell will in addition contain exogenous DNA
sequences or endogenous DNA sequences corresponding to
those found in the normal, i.e., nondefective, endogenous
gene.

"Second transformed target cells" refers to first trans- 
formed target cells whose genome has been subsequently 
modified in a predetermined way. For example, the positive 
selection marker contained in the genome of a first transformed
target cell can be excised by homologous recombination
to produce a second transformed target cell. The
details of such a predetermined genomic manipulation will 
be described in more detail hereinafter. 

As used herein, "heterologous DNA" refers to a DNA
sequence which is different from that sequence comprising 
the target DNA sequence. Heterologous DNA differs from 
target DNA by the substitution, insertion and/or deletion of 
one or more nucleotides. Thus, an endogenous gene 
sequence may be incorporated into a PNS vector to target its
insertion into a different regulatory region of the genome of 
the same organism. The modified DNA sequence so 
obtained is a heterologous DNA sequence. Heterologous 
DNA sequences also include endogenous sequences which 
have been modified to correct or introduce gene defects or
to change the amino acid sequence encoded by the endogenous
gene. Further, heterologous DNA sequences include
exogenous DNA sequences which are not related to endog- 
enous sequences, e.g. sequences derived from a different 
species. Such "exogenous DNA sequences" include those
which encode exogenous polypeptides or exogenous regu- 
latory sequences. For example, exogenous DNA sequences 
which can be introduced into murine or bovine ES cells for 
tissue specific expression (e.g. in mammary secretory cells) 
include human blood factors such as t-PA, Factor VIII,
serum albumin and the like. DNA sequences encoding 
positive selection markers are further examples of heterolo- 
gous DNA sequences. 

The PNS vector is used in the PNS method to select for 
transformed target cells containing the positive selection
marker and against those transformed cells containing the
negative selection marker. Such positive-negative selection
procedures substantially enrich for those transformed target
cells wherein homologous recombination has occurred. As
used herein, "substantial enrichment" refers to at least a
two-fold enrichment of transformed target cells as compared
to the ratio of homologous transformants versus nonhomologous
transformants, preferably a 10-fold enrichment, more
preferably a 1000-fold enrichment, most preferably a
10,000-fold enrichment, i.e., the ratio of transformed target
cells to transformed cells. In some instances, the frequency
of homologous recombination versus random integration is
of the order of 1 in 1000 and in some cases as low as 1 in
10,000 transformed cells. The substantial enrichment
obtained by the PNS vectors and methods of the invention
often results in cell populations wherein about 1%, and more
preferably about 20%, and most preferably about 95% of the
resultant cell population contains transformed target cells
wherein the PNS vector has been homologously integrated.
Such substantially enriched transformed target cell populations
may thereafter be used for subsequent genetic manipulation,
for cell culture experiments or for the production of
transgenic organisms such as transgenic animals or plants.
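
The relationship between fold enrichment and the composition of
the resulting cell population can be illustrated numerically. The
sketch below adopts one reading of the definition above, namely that
enrichment multiplies the ratio of homologous to nonhomologous
transformants. That reading, and the function names
targeted_fraction and required_fold, are assumptions made for this
example rather than statements taken from the specification.

```python
def targeted_fraction(baseline_ratio: float, fold_enrichment: float) -> float:
    """Fraction of the selected population that is homologously targeted.

    baseline_ratio: homologous / nonhomologous transformants before selection.
    fold_enrichment: factor by which selection multiplies that ratio.
    """
    if baseline_ratio < 0 or fold_enrichment < 0:
        raise ValueError("inputs must be non-negative")
    enriched_ratio = baseline_ratio * fold_enrichment
    # Convert a ratio (targeted : non-targeted) into a fraction of the total.
    return enriched_ratio / (1.0 + enriched_ratio)


def required_fold(baseline_ratio: float, target_fraction: float) -> float:
    """Fold enrichment needed to reach a given targeted fraction."""
    if baseline_ratio <= 0:
        raise ValueError("baseline_ratio must be positive")
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must be strictly between 0 and 1")
    needed_ratio = target_fraction / (1.0 - target_fraction)
    return needed_ratio / baseline_ratio


# Starting from roughly 1 homologous event per 1000 random integrants.
fraction_after_1000_fold = targeted_fraction(1 / 1000, 1000)
fold_for_95_percent = required_fold(1 / 1000, 0.95)
```

On this reading, a starting ratio of 1 in 1000 requires about a
1000-fold enrichment before half of the surviving cells are
transformed target cells, and a considerably larger enrichment to
approach a 95% population.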

FIGS. 2a and 2b show the consequences of gene targeting 
(homologous recombination) and random integration of a
PNS vector into the genome of a target cell. The PNS vector
shown contains a neomycin resistance gene as a positive
selection marker (neo^r) and a herpes simplex virus thymidine
kinase (HSV-tk) gene as a negative selection marker.
The neo^r positive selection marker is positioned in an exon
of gene X. This positive selection marker is constructed such
that its expression is under the independent control of
appropriate regulatory sequences. Such regulatory 
sequences may be endogenous to the host cell in which case 
they are preferably derived from genes actively expressed in 
the cell type. Alternatively, such regulatory sequences may be
inducible to permit selective activation of expression of the 
positive selection marker. 

On each side of the neo^r marker are DNA sequences
homologous to the regions 5' and 3' from the point of neo^r
insertion in the exon sequence. These flanking homologous
sequences target the X gene for homologous recombination
with the PNS vector. Consistent with the above description
of the PNS vector, the negative selection marker HSV-tk is
situated outside one of the regions of homology. In this
example it is 3' to the transcribed region of gene X. The neo^r
gene confers resistance to the drug G418 (G418^r) whereas
the presence of the HSV-tk gene renders cells containing this
gene sensitive to gancyclovir (GANC^s). When the PNS
vector is randomly inserted into the genome by a mechanism
other than by homologous recombination (FIG. 2b), insertion
is most frequently via the ends of the linear DNA and
thus the phenotype for such cells is neo^+, HSV-tk^+ (G418^r,
GANC^s). When the PNS vector is incorporated into the
genome by homologous recombination as in FIG. 2a, the
resultant phenotype is neo^+, HSV-tk^- (G418^r, GANC^r).
Thus, those cells wherein random integration of the PNS
vector has occurred can be selected against by treatment
with GANC. Those remaining transformed target cells
wherein homologous recombination has been successful can
then be selected on the basis of neomycin resistance and
GANC resistance. It, of course, should be apparent that the
order of selection for and selection against a particular 
genotype is not important and that in some instances positive 
and negative selection can occur simultaneously. 
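
The selection logic of FIG. 2 can be summarized in a few lines of
code. The following Python sketch is illustrative only: the function
names markers_retained and survives and the three integration labels
are hypothetical, and the model treats random integration as always
retaining both markers, which is the typical outcome described above
rather than a universal one.

```python
from typing import Dict


def markers_retained(integration: str) -> Dict[str, bool]:
    """Markers present in the genome for each integration outcome.

    'homologous': neo is inserted; HSV-tk, lying outside the homology, is not.
    'random':     end-mediated insertion typically retains both markers.
    'none':       an untransformed cell carries neither marker.
    """
    outcomes = {
        "homologous": {"neo": True, "hsv_tk": False},
        "random": {"neo": True, "hsv_tk": True},
        "none": {"neo": False, "hsv_tk": False},
    }
    if integration not in outcomes:
        raise ValueError(f"unknown integration type: {integration!r}")
    return outcomes[integration]


def survives(integration: str, g418: bool = True, ganc: bool = True) -> bool:
    """Whether a cell survives the chosen combination of selective agents."""
    markers = markers_retained(integration)
    if g418 and not markers["neo"]:
        return False  # positive selection: G418 kills cells lacking neo
    if ganc and markers["hsv_tk"]:
        return False  # negative selection: gancyclovir kills HSV-tk+ cells
    return True
```

Because each agent is tested independently, the result is the same
whether the two selections are applied in either order or together,
consistent with the observation that the order of selection is not
important.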

As indicated, the neomycin resistance gene in FIG. 2 is 
incorporated into an exon of gene X. As so constructed, the 
integration of the PNS vector by way of homologous recom- 
bination effectively blocks the expression of gene X. In 
multicellular organisms, however, integration is predomi- 
nantly random and occurs, for the most part, outside of the 
region of the genome encoding gene X. Non-homologous 
recombination therefore will not disrupt gene X in most 
instances. The resultant phenotypes will therefore, in addition
to the foregoing, also be X^- for homologous
recombination and X^+ for random integration. In many cases
it is desirable to disrupt genes by positioning the positive
selection marker in an exon of a gene to be disrupted or
modified. For example, specific proto-oncogenes can be 
mutated by this method to produce transgenic animals. Such 
transgenic animals containing selectively inactivated proto-oncogenes
are useful in dissecting the genetic contribution
of such a gene to oncogenesis and in some cases normal 
development. 

Another potential use for gene inactivation is disruption 
of proteinaceous receptors on cell surfaces. For example, 
cell lines or organisms wherein the expression of a putative 
viral receptor has been disrupted using an appropriate PNS 
vector can be assayed with virus to confirm that the receptor 
is, in fact, involved in viral infection. Further, appropriate 
PNS vectors may be used to produce transgenic animal 
models for specific genetic defects. For example, many gene 
defects have been characterized by the failure of specific 
genes to express functional gene product, e.g. α and β
thalassemia, hemophilia, Gaucher's disease and defects
affecting the production of α-1-antitrypsin, ADA, PNP,
phenylketonuria, familial hypercholesterolemia and retinoblastoma.
Transgenic animals containing disruption of one
or both alleles associated with such disease states or modification
to encode the specific gene defect can be used as
models for therapy. For those animals which are viable at
birth, experimental therapy can be applied. When, however,
the gene defect affects survival, an appropriate generation
(e.g. F0, F1) of transgenic animal may be used to study in
vivo techniques for gene therapy.

A modification of the foregoing means to disrupt gene X 
by way of homologous integration involves the use of a
positive selection marker which is deficient in one or more
regulatory sequences necessary for expression. The PNS
vector is constructed so that part but not all of the regulatory
sequences for gene X are contained in the PNS vector 5'
from the structural gene segment encoding the positive
selection marker, e.g., homologous sequences encoding part
of the promoter of the X gene. As a consequence of this
construction, the positive selection marker is not functional
in the target cell until such time as it is homologously
integrated into the promoter region of gene X. When so
integrated, gene X is disrupted and such cells may be
selected by way of the positive selection marker expressed
under the control of the target gene promoter. The only
limitation in using such an approach is the requirement that
the targeted gene be actively expressed in the cell type used.
Otherwise, the positive selection marker will not be
expressed to confer a positive selection characteristic on the
cell.
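
The condition under which a regulatory-deficient positive marker
confers selection can be stated as a short predicate. The sketch
below is a simplified model offered for illustration: the function
name positive_marker_expressed and its parameters are hypothetical,
and the model ignores the possibility that a randomly integrated
marker lands next to some other active promoter.

```python
def positive_marker_expressed(has_own_promoter: bool,
                              integrated_at_target: bool,
                              target_gene_active: bool) -> bool:
    """Whether an integrated positive selection marker is expressed.

    Assumes the marker has integrated somewhere in the genome.
    """
    if has_own_promoter:
        # An independently regulated marker is expressed wherever it integrates.
        return True
    # A marker lacking regulatory sequences depends on the target gene's
    # promoter, so it needs both homologous integration and an active target.
    return integrated_at_target and target_gene_active


# Marker deficient in regulatory sequences, homologously integrated,
# in a cell type that actively expresses the target gene.
selectable = positive_marker_expressed(False, True, True)
# Same vector in a cell type where the target gene is silent.
silent_target = positive_marker_expressed(False, True, False)
```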

In many instances, the disruption of an endogenous gene 
is undesirable, e.g., for some gene therapy applications. In 
such situations, the positive selection marker comprising the 
third DNA sequence of the PNS vector may be positioned
within an untranslated sequence, e.g. an intron of the target
DNA or 5' or 3' untranslated regions. FIG. 3 depicts such a
PNS vector. As indicated, the first DNA sequence comprises
part of exon I and a portion of a contiguous intron in the
target DNA. The second DNA sequence encodes an adjacent
portion of the same intron and optionally may include all or
a portion of exon II. The positive selection marker of the
third DNA sequence is positioned between the first and
second sequences. The fourth DNA sequence encoding the
negative selection marker, of course, is positioned outside of
the region of homology. When the PNS vector is integrated
into the target DNA by way of homologous recombination
the positive selection marker is located in the intron of the
targeted gene. The third DNA sequence is constructed such
that it is capable of being expressed and translated independently
of the targeted gene. Thus, it contains an independent
functional promoter, translation initiation sequence, translation
termination sequence, and in some cases a polyadenylation
sequence and/or one or more enhancer sequences,
each functional in the cell type transfected with the PNS
vector. In this manner, cells incorporating the PNS vector by
way of homologous recombination can be selected by way
of the positive selection marker without disruption of the
endogenous gene. Of course, the same regulatory sequences
can be used to control the expression of the positive selection
marker when it is positioned within an exon. Further,
such regulatory sequences can be used to control expression
of the negative selection marker. Regulatory sequences
useful in controlling the expression of positive and/or negative
selection markers are listed in Table IIB. Of course,
other regulatory sequences may be used which are known to
those skilled in the art. In each case, the regulatory
sequences will be properly aligned and, if necessary, placed
in proper reading frame with the particular DNA sequence to
be expressed. Regulatory sequences, e.g. enhancers and promoters
from different sources, may be combined to provide
gene expression.



TABLE IIA

Tissue Specific Regulatory Sequences

Promoter/Enhancer
PNMT
β-globin
(exocrine)
Pituitary
α-FP
WAP

Reference
Baetge et al. (1988) PNAS 85
Townes et al. (1985) EMBO J 4:1715
Overbeek et al. (1985) PNAS 82:7815
Krumlauf et al. (1985) MCB 5:1639
Yamamura et al. (1986) PNAS 83:2152
Gordon et al. (1987) Bio/Technology 5:1183
(illegible) et al. (1989) MCB 9:3122
Hanahan (1985) Nature 315:115
Swift et al. (1984) Cell 38:639
Ingraham et al. (1988) Cell 55:579
Johnson et al. (1989) MCB 9:3393
Stewart et al. (1988) MCB 8:1748


PYF441 
(pMa-Neoi 
ASV-LTR 
SV-40 early 



fibroblasts 
variety of : 



haemopoietic stem cells



protoplasts 



A modification of the target DNA sequence is also shown 
in FIG. 3. In exon I of the target DNA sequence, the sixth 
codon GTG is shown which encodes valine. In the first DNA 
sequence of the PNS vector, the codon GAG replaces the 
GTG codon in exon I. This latter codon encodes glutamic acid.
Cells selected for homologous recombination as a consequence
encode a modified protein wherein the amino acid
encoded by the sixth codon is changed from valine to
glutamic acid.
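
The codon change shown in FIG. 3 can be checked against the
standard genetic code, in which GTG specifies valine and GAG
specifies glutamic acid (glutamine is specified by CAA and CAG).
The short Python sketch below is illustrative only. Its lookup table
covers just the three codons relevant here, and the function name
describe_codon_change is a hypothetical name chosen for this example.

```python
# Partial lookup from the standard genetic code (DNA coding-strand codons).
# Only the codons relevant to this example are listed.
CODON_TO_AMINO_ACID = {
    "GTG": "valine",
    "GAG": "glutamic acid",
    "CAG": "glutamine",
}


def describe_codon_change(old: str, new: str) -> str:
    """Return a readable description of a single-codon substitution."""
    old, new = old.upper(), new.upper()
    old_aa = CODON_TO_AMINO_ACID[old]
    new_aa = CODON_TO_AMINO_ACID[new]
    return f"{old} ({old_aa}) -> {new} ({new_aa})"


# Sixth-codon substitution encoded by the first DNA sequence of the vector.
change = describe_codon_change("GTG", "GAG")
```

The substitution involves a single nucleotide, the central T of GTG
being replaced by A.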

There are, of course, numerous other examples of modi- 
fications of target DNA sequences in the genome of the cell 
which can be obtained by the PNS vectors and methods of 
the invention. For example, endogenous regulatory 
sequences controlling the expression of proto-oncogenes 
can be replaced with regulatory sequences such as promoters 
and/or enhancers which actively express a particular gene in
a specific cell type in an organism, i.e., tissue-specific
regulatory sequences. In this manner, the expression of a
proto-oncogene in a particular cell type, for example in a
transgenic animal, can be controlled to determine the effect
of oncogene expression in a cell type which does not
normally express the proto-oncogene. Alternatively, known
viral oncogenes can be inserted into specific sites of the
target genome to bring about tissue-specific expression of
the viral oncogene. Examples of preferred tissue-specific
regulatory sequences are listed in Table IIA. Examples of
proto-oncogenes which may be modified by the PNS vectors
and methods to produce tissue specific expression and viral
oncogenes which may be placed under control of endogenous
regulatory sequences are listed in Tables IIIA and IIIB,
respectively.

TABLE IIIA

Proto-oncogenes involved in human tumors

c-abl
chronic myelogenous leukemia
squamous cell carcinoma
glial blastoma
Burkitt's lymphoma
small cell carcinoma of lung
carcinoma of breast
small cell carcinoma of lung
small cell carcinoma of lung
neuroblastoma

TABLE IIIB

ectopically expressed in mice

HPV-E6
HPV-E7
PyTag

As indicated, the positive-negative selection methods and 
vectors of the invention are used to modify target DNA 
sequences in the genome of target cells capable of homologous
recombination. Accordingly, the invention may be
practiced with any cell type which is capable of homologous
recombination. Examples of such target cells include cells
derived from vertebrates including mammals such as
humans, bovine species, ovine species, murine species,
simian species, and other eucaryotic organisms such as
filamentous fungi, and higher multicellular organisms such
as plants. The invention may also be practiced with lower
organisms such as gram positive and gram negative bacteria
capable of homologous recombination. However, such
lower organisms are not preferred because they generally do
not demonstrate significant non-homologous recombination, 
i.e., random integration. Accordingly, there is little or no 
need to select against non-homologous transformants. 

In those cases where the ultimate goal is the production of 
a non-human transgenic animal, embryonic stem cells (ES
cells) are preferred target cells. Such cells have been
manipulated to introduce transgenes. ES cells are obtained
from pre-implantation embryos cultured in vitro. Evans, M.
J., et al. (1981), Nature, 292, 154-156; Bradley, M. O., et al.
(1984), Nature, 309, 255-258; Gossler, et al. (1986), Proc.
Natl. Acad. Sci. U.S.A., 83, 9065-9069; and Robertson, et al.
(1986), Nature, 322, 445-448. PNS vectors can be efficiently
introduced into the ES cells by electroporation or
microinjection or other transformation methods, preferably
electroporation. Such transformed ES cells can thereafter be
combined with blastocysts from a non-human animal. The
ES cells thereafter colonize the embryo and can contribute
to the germ line of the resulting chimeric animal. For review
see Jaenisch, R. (1988), Science, 240, 1468-1474. In the
present invention, PNS vectors are targeted to a specific
portion of the ES cell genome and thereafter used to generate
chimeric transgenic animals by standard techniques.

When the ultimate goal is gene therapy to correct a 
genetic defect in an organism such as a human being, the cell 
type will be determined by the etiology of the particular 
disease and how it is manifested. For example, hemopoietic
stem cells are preferred cells for correcting genetic defects
in cell types which differentiate from such stem cells, e.g.
erythrocytes and leukocytes. Thus, genetic defects in globin
chain synthesis in erythrocytes such as sickle cell anemia,
β-thalassemia and the like may be corrected by using the
PNS vectors and methods of the invention with hematopoietic
stem cells isolated from an affected patient. For
example, if the target DNA in FIG. 3 is the sickle-cell
β-globin gene contained in a hematopoietic stem cell and the
PNS vector in FIG. 3 is targeted for this gene with the
modification shown in the sixth codon, transformed hematopoietic
stem cells can be obtained wherein a normal β-globin
will be expressed upon differentiation. After correction of
the defect, the hematopoietic stem cells may be returned to
the bone marrow or systemic circulation of the patient to
form a subpopulation of erythrocytes containing normal
hemoglobin. Alternatively, hematopoietic stem cells may be
destroyed in the patient by way of irradiation and/or chemotherapy
prior to reintroduction of the modified hematopoietic
stem cell, thereby completely rectifying the defect.

Other types of stem cells may be used to correct the 
specific gene defects associated with cells derived from such 
stem cells. Such other stem cells include epithelial, liver,
lung, muscle, endothelial, mesenchymal, neural and bone stem
cells. Table IV identifies a number of known genetic defects 
which are amenable to correction by the PNS methods and 
vectors of the invention. 

Alternatively, certain disease states can be treated by 
modifying the genome of cells in a way which does not 
correct a genetic defect per se but provides for the supple- 
mentation of the gene product of a defective gene. For 
example, endothelial cells are preferred as targets for human 
gene therapy to treat disorders affecting factors normally 
present in the systemic circulation. In model studies using
both dogs and pigs, endothelial cells have been shown to
form primary cultures, to be transformable with DNA in
culture, and to be capable of expressing a transgene upon
re-implantation in arterial grafts into the host organism.
Wilson, et al. (1989), Science, 244, 1344; Nabel, et al.
(1989), Science, 244, 1342. Since endothelial cells form an
integral part of the graft, such transformed cells can be used
to produce proteins to be secreted into the circulatory system
and thus serve as therapeutic agents in the treatment of
genetic disorders affecting circulating factors. Examples of
such diseases include insulin-deficient diabetes,
α-1-antitrypsin deficiency, and hemophilia. Epithelial cells provide a
particular advantage in the treatment of factor VIII-deficient
hemophilia. These cells naturally produce von Willebrand
factor and it has been shown that production of active factor
VIII is dependent upon the autonomous synthesis of vWF
(Toole, et al. (1986), Proc. Natl. Acad. Sci. U.S.A., 83,
5939).

As indicated in Example 4, human endothelial cells from
a hemophiliac patient deficient in Factor VIII are modified
by a PNS vector to produce an enriched population of
transformed endothelial cells wherein the expression of
DNA sequences encoding a secretory form of Factor VIII is
placed under the control of the regulatory sequences of the
endogenous β-actin gene. Such transformed cells are
implanted into vascular grafts from the patient. After
incorporation of transformed cells, the graft is placed back into the
vascular system of the patient. The transformed cells secrete
Factor VIII into the vascular system to supplement the
defect in the patient's blood clotting system.

Other diseases of the immune and/or the circulatory
system are candidates for human gene therapy. The target
tissue, bone marrow, is readily accessible by current
technology, and advances are being made in culturing stem cells
in vitro. The immune deficiency diseases caused by mutations
in the enzymes adenosine deaminase (ADA) and
purine nucleoside phosphorylase (PNP) are of particular
interest. Not only have the genes been cloned, but cells
corrected by PNS gene therapy are likely to have a selective
advantage over their mutant counterparts. Thus, ablation of
the bone marrow in recipient patients may not be necessary.

The PNS approach is applicable to genetic disorders with
the following characteristics: first, the DNA sequence and
preferably the cloned normal gene must be available; second,
the appropriate, tissue relevant, stem cell or other
appropriate cell must be available. Below is Table IV listing
some of the known genetic diseases, the name of the cloned
gene, and the tissue type in which therapy may be appropriate.
These and other genetic diseases amenable to the PNS
methods and vectors of the invention have been reviewed.
See Friedman (1989), Science, 244, 1275; Nichols, E. K.
(1988), Human Gene Therapy (Harvard University Press);
and Cold Spring Harbor Symposium on Quantitative Biology,
Vol. 11 (1986), "The Biology of Homo Sapiens" (Cold
Spring Harbor Press).

TABLE IV

Gaucher Disease                 glucocerebrosidase
Granulocyte Actin Deficiency    Granulocyte Actin
Dystrophy
Phenylketonuria
Sickle

Genetic defects may be corrected in specific
cell lines by positioning the positive selection marker (the
second DNA sequence in the PNS vector) in an untranslated
region such as an intron near the site of the genetic defect
together with flanking segments to correct the defect. In this
approach, the positive selection marker is under its own
regulatory control and is capable of expressing itself without
substantially interfering with the expression of the targeted
gene. In the case of human gene therapy, it may be desirable
to introduce only those DNA sequences which are necessary
to correct the particular genetic defect. In this regard, it is
desirable, although not necessary, to remove the residual
positive selection marker which remains after correction of
the genetic defect by homologous recombination.

The removal of a positive selection marker from a
genome in which homologous insertion of a PNS vector has
occurred can be accomplished in many ways. For example,
the PNS vector can include a second negative selection
marker contained within the third DNA sequence. This
second negative selection marker is different from the first
negative selection marker contained in the fourth DNA
sequence. After homologous integration, a second modified
target DNA sequence is formed containing the third DNA
sequence encoding both the positive selection marker and
the second negative selection marker. After isolation and
purification of the first transformed target cells by way of
negative selection against transformed cells containing the
first negative selection marker and for those cells containing
the positive selection marker, the first transformed target
cells are subjected to a second cycle of homologous
recombination. In this second cycle, a second homologous vector
is used which contains all or part of the first and second DNA
sequence of the PNS vector (encoding the second
modification in the target DNA) but not those sequences encoding
the positive and second negative selection markers. The
second negative selection marker in the first transformed
target cells is then used to select against unsuccessful
transformants and cells wherein the second homologous
vector is randomly integrated into the genome. Homologous
recombination of this second homologous vector, however,
with the second modified target DNA sequence results in a
second transformed target cell type which does not contain
either the positive selection marker or the second negative
selection marker but which retains the modification encoded
by the first and/or second DNA sequences. Cells which have
not homologously integrated the second homologous vector
are selected against using the second negative selection
marker.
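
The two-cycle marker removal scheme described above can be followed with a small bookkeeping model. The Python sketch below is an illustration only, under stated assumptions: the `Cell` class, its field names and the two `survives_*` functions are hypothetical names introduced here, and the model merely tracks which markers a cell carries. It does not model recombination frequencies or any laboratory protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Hypothetical record of the markers carried by one cell."""
    positive: bool = False    # positive selection marker (third DNA sequence)
    negative_1: bool = False  # first negative marker (fourth DNA sequence)
    negative_2: bool = False  # second negative marker (within third DNA sequence)
    modified: bool = False    # desired modification present at the target


def survives_cycle_1(cell: Cell) -> bool:
    """First cycle: select for the positive marker, against the first negative marker."""
    return cell.positive and not cell.negative_1


def survives_cycle_2(cell: Cell) -> bool:
    """Second cycle: select against the second negative marker."""
    return not cell.negative_2


# After the first cycle of homologous recombination:
targeted = Cell(positive=True, negative_2=True, modified=True)
random_integrant = Cell(positive=True, negative_1=True, negative_2=True)
untransformed = Cell()

# After the second cycle, a homologous replacement removes both markers
# of the third DNA sequence but keeps the modification:
cleaned = Cell(modified=True)

for name, cell in [("targeted", targeted),
                   ("random integrant", random_integrant),
                   ("untransformed", untransformed)]:
    print(name, "survives cycle 1:", survives_cycle_1(cell))

for name, cell in [("cleaned", cleaned), ("unchanged targeted", targeted)]:
    print(name, "survives cycle 2:", survives_cycle_2(cell))
```

In this model a cell that fails to integrate the second homologous vector homologously still carries the second negative marker and is therefore removed in the second cycle, which mirrors the last sentence of the paragraph above.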

The PNS vectors and methods of the invention are also
applicable to the manipulation of plant cells and ultimately
the genome of the entire plant. A wide variety of transgenic
plants have been reported, including herbaceous dicots,
woody dicots and monocots. For a summary, see Gasser, et
al. (1989), Science, 244, 1293-1299. A number of different
gene transfer techniques have been developed for producing
such transgenic plants and transformed plant cells. One
technique used Agrobacterium tumefaciens as a gene transfer
system. Rogers, et al. (1986), Methods Enzymol., 118,
627-640. A closely related transformation utilizes the
bacterium Agrobacterium rhizogenes. In each of these systems
a Ti or Ri plant transformation vector can be constructed
containing border regions which define the DNA sequence
to be inserted into the plant genome. These systems previously
have been used to randomly integrate exogenous DNA
into plant genomes. In the present invention, an appropriate
PNS vector may be inserted into the plant transformation
vector between the border sequences defining the DNA
sequences transferred into the plant cell by the Agrobacterium
transformation vector.

Preferably, the PNS vector of the invention is directly
transferred to plant protoplasts by way of methods analogous
to that previously used to introduce transgenes into
protoplasts. See, e.g. Paszkowski, et al. (1984), EMBO J., 3,
2717-2722; Hain, et al. (1985), Mol. Gen. Genet., 199,
161-168; Shillito, et al. (1985), Bio/Technology, 3,
1099-1103; and Negrutiu, et al. (1987), Plant Mol. Bio., 8,
363-373. Alternatively, the PNS vector is contained within
a liposome which may be fused to a plant protoplast (see,
e.g. Deshayes, et al. (1985), EMBO J., 4, 2731-2738) or is
directly inserted into plant protoplasts by way of intranuclear
microinjection (see, e.g. Crossway, et al. (1986), Mol. Gen.
Genet., 202, 179-185, and Reich, et al. (1986), Bio/Technology,
4, 1001-1004). Microinjection is the preferred
method for transfecting protoplasts. PNS vectors may also
be microinjected into meristematic inflorescences. De la
Pena et al. (1987), Nature, 325, 274-276. Finally, tissue
explants can be transfected by way of a high velocity
microprojectile coated with the PNS vector analogous to the
methods used for insertion of transgenes. See, e.g. Vasil
(1988), Bio/Technology, 6, 397; Klein, et al. (1987), Nature,
327, 70; Klein, et al. (1988), Proc. Natl. Acad. Sci. U.S.A.,
85, 8502; McCabe, et al. (1988), Bio/Technology, 6, 923; and
Klein, et al., Genetic Engineering, Vol 11, J. K. Setlow
editor (Academic Press, N.Y., 1989). Such transformed
explants can be used to regenerate, for example, various cereal
crops. Vasil (1988), Bio/Technology, 6, 397.

Once the PNS vector has been inserted into the plant cell
by any of the foregoing methods, homologous recombination
targets the PNS vector to the appropriate site in the plant
genome. Depending upon the methodology used to transfect,
positive-negative selection is performed on tissue cultures of
the transformed protoplast or plant cell. In some instances,
cells amenable to tissue culture may be excised from a
transformed plant either from the F0 or a subsequent
generation.

The PNS vectors and methods of the invention are used to
precisely modify the plant genome in a predetermined way.
Thus, for example, herbicide, insect and disease resistance
may be predictably engineered into a specific plant species,
to provide, for example, tissue specific resistance, e.g.,
insect resistance in leaf and bark. Alternatively, the expression
levels of various components within a plant may be
modified by substituting appropriate regulatory elements to
change the fatty acid and/or oil content in seed, the starch
content within the plant and the elimination of components
contributing to undesirable flavors in food. Alternatively,
heterologous genes may be introduced into plants under the
predetermined regulatory control in the plant to produce
various hydrocarbons including waxes and hydrocarbons
used in the production of rubber.

The amino acid composition of various storage proteins in
wheat and corn, for example, which are known to be
deficient in lysine and tryptophan may also be modified.
PNS vectors can be readily designed to alter specific codons
within such storage proteins to encode lysine and/or
tryptophan, thereby increasing the nutritional value of such
crops. For example, the zein protein in corn (Pederson et al.
(1982), Cell, 29, 1015) may be modified to have a higher
content of lysine and tryptophan by the vectors and methods
of the invention.

It is also possible to modify the levels of expression of
various positive and negative regulatory elements controlling
the expression of particular proteins in various cells and
organisms. Thus, the expression level of negative regulatory
elements may be decreased by use of an appropriate promoter
to enhance the expression of a particular protein or
proteins under control of such a negative regulatory element.
Alternatively, the expression level of a positive regulatory
protein may be increased to enhance expression of the
regulated protein or decreased to reduce the amount of
regulated protein in the cell or organism.

The basic elements of the PNS vectors of the invention
have already been described. The selection of each of the
DNA sequences comprising the PNS vector, however, will
depend upon the cell type used, the target DNA sequence to
be modified and the type of modification which is desired.

Preferably, the PNS vector is a linear double stranded
DNA sequence. However, circular closed PNS vectors may
also be used. Linear vectors are preferred since they enhance
the frequency of homologous integration into the target
DNA sequence. Thomas, et al. (1986), Cell, 44, 49.

In general, the PNS vector (including first, second, third
and fourth DNA sequences) has a total length of between 2.5
kb (2500 base pairs) and 1000 kb. The lower size limit is set
by two criteria. The first of these is the minimum necessary
length of homology between the first and second sequences
of the PNS vector and the target locus. This minimum is
approximately 500 bp (DNA sequence 1 plus DNA sequence
2). The second criterion is the need for functional genes in
the third and fourth DNA sequences of the PNS vector. For
practical reasons, this lower limit is approximately 1000 bp
for each sequence. This is because the smallest DNA
sequences encoding known positive and negative selection
markers are about 1.0-1.5 kb in length.

The upper limit to the length of the PNS vector is
determined by the state of the technology used to manipulate
DNA fragments. If these fragments are propagated as bacterial
plasmids, a practical upper length limit is about 25 kb;
if propagated as cosmids, the limit is about 50 kb; if
propagated as YACs (yeast artificial chromosomes) the limit
approaches 1000 kb (Burke, et al. (1987), Science, 236,
806).
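
The size limits stated in the two preceding paragraphs reduce to simple arithmetic. The following Python sketch is a worked illustration only; the constant and function names are hypothetical, and the numbers are the approximate figures given in the text (500 bp of combined homology, about 1000 bp for each marker sequence, and upper limits of about 25, 50 and 1000 kb).

```python
from typing import Optional

MIN_HOMOLOGY_BP = 500   # DNA sequence 1 plus DNA sequence 2
MIN_MARKER_BP = 1000    # each of DNA sequences 3 and 4

# Approximate practical upper limits, in kb, by propagation method.
UPPER_LIMIT_KB = {"plasmid": 25, "cosmid": 50, "YAC": 1000}


def minimum_vector_length_bp() -> int:
    """Lower size limit: minimum homology plus two minimal marker sequences."""
    return MIN_HOMOLOGY_BP + 2 * MIN_MARKER_BP


def smallest_carrier(length_kb: float) -> Optional[str]:
    """Return the first propagation method whose limit accommodates the vector."""
    for name, limit_kb in UPPER_LIMIT_KB.items():  # insertion order: smallest first
        if length_kb <= limit_kb:
            return name
    return None  # longer than the stated YAC limit


print(minimum_vector_length_bp())   # 500 + 2 * 1000 base pairs
print(smallest_carrier(40))
```

The lower bound computed this way is 2500 bp, which agrees with the 2.5 kb figure given for the total vector length.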

Within the first and second DNA sequences of the PNS
vector are portions of DNA sequence which are substantially
homologous with sequence portions contained within the
first and second regions of the target DNA sequence. The
degree of homology between the vector and target sequences
influences the frequency of homologous recombination
between the two sequences. One hundred percent sequence
homology is most preferred; however, lower sequence
homology can be used to practice the invention. Thus,
sequence homology as low as about 80% can be used. A
practical lower limit to sequence homology can be defined
functionally as that amount of homology which if further
reduced does not mediate homologous integration of the
PNS vector into the genome. Although as few as 25 bp of
100% homology are required for homologous recombination
in mammalian cells (Ayares, et al. (1986), Genetics, 83,
5199-5203), longer regions are preferred, e.g., 500 bp, more
preferably, 5000 bp, and most preferably, 25000 bp for each
homologous portion. These numbers define the limits of the
individual lengths of the first and second sequences.
Preferably, the homologous portions of the PNS vector will be
100% homologous to the target DNA sequence, as increasing
the amount of non-homology will result in a corresponding
decrease in the frequency of gene targeting. If
non-homology does exist between the homologous portion of the
PNS vector and the appropriate region of the target DNA, it
is preferred that the non-homology not be spread throughout
the homologous portion but rather in discrete areas of the
homologous portion. It is also preferred that the homologous
portion of the PNS vector adjacent to the negative selection
marker (fourth or fifth DNA sequence) be 100% homologous
to the corresponding region in the target DNA. This is
to ensure maximum discontinuity between homologous and
non-homologous sequences in the PNS vector.
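
The homology figures above can be made concrete with a small calculation. The Python sketch below is a minimal illustration that assumes "percent homology" means simple position-by-position identity between two pre-aligned, equal-length sequences with no gaps; that definition, the function names and the example sequences are assumptions made here and are not taken from the specification.

```python
def percent_homology(vector_seq: str, target_seq: str) -> float:
    """Percent identity between two pre-aligned, equal-length sequences."""
    if not vector_seq or len(vector_seq) != len(target_seq):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = zip(vector_seq.upper(), target_seq.upper())
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(vector_seq)


def usable_homology(vector_seq: str, target_seq: str,
                    minimum_percent: float = 80.0) -> bool:
    """Check a homologous portion against the approximate 80% lower figure."""
    return percent_homology(vector_seq, target_seq) >= minimum_percent


# Hypothetical 10-base examples (far shorter than any real homologous portion).
print(percent_homology("ACGTACGTAC", "ACGTACGTAC"))
print(usable_homology("ACGTACGTAC", "ACGTACGTTT"))
print(usable_homology("AAAAAAAAAA", "AAAAAAACCC"))
```

A per-position score of this kind says nothing about how mismatches are distributed; the text's preference for non-homology confined to discrete areas would need a separate check.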

Increased frequencies of homologous recombination have
been observed when the absolute amount of DNA sequence
in the combined homologous portions of the first and second
DNA sequence is increased. FIG. 4 depicts the targeting
frequency of the Hprt locus as a function of the extent of
homology between an appropriate PNS vector and the
endogenous target. A series of replacement (A) and insertion
(•) Hprt vectors were constructed that varied in the extent
of homology to the endogenous Hprt gene. Hprt sequences
in each vector were interrupted in the eighth exon with the
neomycin resistance gene. The amount of Hprt sequence 3'
to the neo gene was kept constant and the amount of Hprt
sequence 5' to the neo gene was varied. The absolute frequency of
independent targeting events per total ES cells electroporated
is plotted in FIG. 4 on the logarithmic scale as a
function of the number of kilobases of Hprt sequence
contained within the PNS vectors. See Capecchi, M. R.
(1989), Science, 244, 1288-1292.

As previously indicated, the fourth DNA sequence containing
the negative selection marker should have sufficient
non-homology to the target DNA sequence to prevent
homologous recombination between the fourth DNA
sequence and the target DNA. This is generally not a
problem since it is unlikely that the negative selection
marker chosen will have any substantial homology to the
target DNA sequence. In any event, the sequence homology
between the fourth DNA sequence and the target DNA
sequence should be less than about 50%, most preferably
less than about 30%.

A preliminary assay for sufficient sequence non-homology
between the fourth DNA sequence and the target DNA
sequence utilizes standard hybridization techniques. For
example, the particular negative selection marker may be
appropriately labeled with a radioisotope or other detectable
marker and used as a probe in a Southern blot analysis of the
genomic DNA of the target cell. If little or no signal is
detected under intermediate stringency conditions such as
3× SSC when hybridized at about 55° C., that negative
selection marker should be functional in a PNS vector
designed for homologous recombination in that cell type.
However, even if a signal is detected, it is not necessarily
indicative that the particular negative selection marker cannot be used
in a PNS vector targeted for that genome. This is because the
negative selection marker may be hybridizing with a region
of the genome which is not in proximity with the target DNA
sequence. Since the target DNA sequence is defined as those
DNA sequences corresponding to first, second, third, and in
some cases, fourth regions of the genome, Southern blots
localizing the regions of the target DNA sequence may be
performed. If the probe corresponding to the particular
negative selection marker does not hybridize to these bands,
it should be functional for PNS vectors directed to these
regions of the genome.

Hybridization between sequences encoding the negative
selection marker and the genome or target regions of a
genome, however, does not necessarily mean that such a
negative selection marker will not function in a PNS vector.
The hybridization assay is designed to detect those
sequences which should function in the PNS vector because
of their failure to hybridize to the target. Ultimately, a DNA
sequence encoding a negative selection marker is functional
in a PNS vector if it is not integrated during homologous
recombination regardless of whether or not it hybridizes
with the target DNA.

It is also possible that high stringency hybridization can
be used to ascertain whether genes from one species can be
targeted into related genes in a different species. For
example, preliminary gene therapy experiments may require
that human genomic sequences replace the corresponding
related genomic sequence in mouse cells. High stringency
hybridization conditions such as 0.1× SSC at about 68° C.
can be used to correlate hybridization signal under such
conditions with the ability of such sequences to act as
homologous portions in the first and second DNA sequence
of the PNS vector. Such experiments can be routinely
performed with various genomic sequences having known
differences in homology. The measure of hybridization may
therefore correlate with the ability of such sequences to
bring about acceptable frequencies of recombination.

Table I identifies various positive and negative selection 
markers which may be used respectively in the third and 
fourth DNA sequences of the PNS vector together with the 
conditions used to select for or against cells expressing each 
of the selection markers. For animal cells such as mouse
L cells and ES cells, preferred positive selection markers
include DNA sequences encoding neomycin resistance and 
hygromycin resistance, most preferably neomycin resis- 
tance. For plant cells preferred positive selection markers 
include neomycin resistance and bleomycin resistance, most 
preferably neomycin resistance. 

For animal cells, preferred negative selection markers
include gpt and HSV-tk, most preferably HSV-tk. For plant
cells, preferred negative selection markers include gpt and
HSV-tk. As genes responsible for bacterial and fungal
pathogenesis in plants are cloned, other negative markers will
become readily available.

As used herein, a "positive screening marker" refers to a
DNA sequence used in a phage rescue screening method to
detect homologous recombination. An example of such a
positive screening marker is the supF gene which encodes a
tyrosine transfer RNA which is capable of suppressing
amber mutations. See Smithies, et al. (1985), Nature, 317,
230-234.

The following is presented by way of example and is not
to be construed as a limitation on the scope of the invention.

EXAMPLE 1

Inactivation at the int-2 locus in mouse ES cells

1. PNS Vector Construction

The PNS vector, pINT-2-N/TK, is described in Mansour,
et al. (1988), Nature, 336, 349. This vector was used to
disrupt the proto-oncogene, INT-2, in mouse ES cells. As
shown in FIG. 5c, it contains DNA sequences 1 and 2
homologous to the target INT-2 genomic sequences in
mouse ES cells. These homologous sequences were
obtained from a plasmid referred to as pAT-153 (Peters, et al.
(1983), Cell, 33, 369). DNA sequence 3, the positive selection
moiety of the PNS vector, was the Neo gene from the
plasmid pMC1Neo described in Thomas, et al. (1987), Cell,
51, 503; DNA sequence 4, the negative selection element of
the vector, was the HSV-TK gene derived from the plasmid
pIC-19-R/TK which is widely available in the scientific
community.

Plasmid pIC19R/MC1-TK (FIG. 5d) contains the HSV-TK
gene engineered for expression in ES cells (Mansour, et
al. (1988), Nature, 336, 348-352). The TK gene, flanked by
a duplication of a mutant polyoma virus enhancer, PYF441,
has been inserted into the vector, pIC19R (Marsh, et al.
(1984), Gene, 32, 481-485) between the XhoI and the
HindIII sites. The map of plasmid pIC19R/MC1-TK is
shown in FIG. 5d. The enhancer sequence is as follows:

5'-CTCGAGCAGTGTGGTTTTGCAAGAGGAAGCAAAAAGCCTCTCCACCCAGGC
CTGGAATGTTTCCACCCAATGTCGAGCAGTGTGGTTTTGCAAGAGGAAGC
AAAAAGCCTCTCCACCCAGGCCTGGAATGTTTCCACCCAATGTCGAG-3'

The 5' end is an XhoI restriction enzyme site; the 3' end
is contiguous with the HSV-TK gene. The HSV-TK
sequences are from nucleotides 92-1799 (McKnight (1980),
Nucl. Acids. Res., 8, 5949-5964) followed at the 3' end by
a HindIII linker. The plasmid pIC19R is essentially identical
to the pUC vectors, with an alternative polylinker as shown
in FIG. 5d.
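
The two structural statements made about the enhancer region, that its 5' end is an XhoI site and that the enhancer is present as a duplication, can be checked mechanically. The Python sketch below is an illustration under a loud assumption: the three sequence lines are a transcription in which hard-to-read positions in the first copy were read so as to match the second copy, so the check confirms internal consistency of that reading and is not independent evidence for the sequence. The function name `tandem_unit` is hypothetical.

```python
XHOI_SITE = "CTCGAG"  # recognition sequence of XhoI

# Enhancer region as transcribed (assumption: both copies are identical).
ENHANCER_LINES = [
    "CTCGAGCAGTGTGGTTTTGCAAGAGGAAGCAAAAAGCCTCTCCACCCAGGC",
    "CTGGAATGTTTCCACCCAATGTCGAGCAGTGTGGTTTTGCAAGAGGAAGC",
    "AAAAAGCCTCTCCACCCAGGCCTGGAATGTTTCCACCCAATGTCGAG",
]


def tandem_unit(sequence, site=XHOI_SITE):
    """Return the repeated unit if `sequence` is `site` followed by two
    identical copies of one unit; otherwise return None."""
    if not sequence.startswith(site):
        return None
    body = sequence[len(site):]
    half, remainder = divmod(len(body), 2)
    if remainder or body[:half] != body[half:]:
        return None
    return body[:half]


enhancer = "".join(ENHANCER_LINES)
unit = tandem_unit(enhancer)
print(len(enhancer), None if unit is None else len(unit))
```

If the transcription contained a single mismatch between the two copies, `tandem_unit` would return `None`, which makes the function a convenient guard when re-keying a sequence from a printed source.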

Construction of the vector, pINT-2-N/TK, involved five
sequential steps as depicted in FIG. 5. First, a 3,965 bp PstI
fragment containing exon 1b was excised from pAT153 and
inserted into the PstI site of Bluescribe® (Stratagene of
La Jolla, Calif.), an Amp^r bacterial plasmid containing a
multi-enzyme, cloning polylinker. Second, a synthetic XhoI
linker of sequence



    5'-GCTCGAGCGGCC-3'
       ||||||||
3'-CCGGCGAGCTCG-5'

was inserted into the ApaI site on exon 1b. Third, the
XhoI-SalI Neo^r fragment from pMC1Neo was inserted into
the XhoI linker in exon 1b. Fourth, the 3,965 bp INT-2 PstI
fragment containing the Neo^r gene was reinserted into
pAT153, to generate the plasmid pINT-2-N as shown in FIG.
5b. This plasmid also includes the third exon of the int-2
gene. Fifth, the ClaI-HindIII HSV-tk fragment from
pIC-19-R/TK was inserted into ClaI-HindIII digested pINT2-N,
creating the final product, pINT2-N/TK. This vector was
linearized by digestion with ClaI prior to its introduction
into ES cells.
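
The synthetic XhoI linker used in the second step can be examined with a short script. The Python sketch below is an illustration only: it assumes the lower strand of the linker is printed 3' to 5' and pairs with the first eight bases of the upper strand, which is an interpretation of the printed layout. The function names are hypothetical.

```python
XHOI_SITE = "CTCGAG"  # recognition sequence of XhoI
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def describe_linker(top: str, bottom_printed: str, paired: int) -> dict:
    """Summarize a linker whose lower strand is printed 3'->5' (assumption)."""
    bottom = bottom_printed[::-1]  # rewrite the lower strand 5'->3'
    return {
        "has_xhoi_site": XHOI_SITE in top,
        "strands_pair": reverse_complement(top[:paired]) == bottom[:paired],
        "top_overhang": top[paired:],        # unpaired 3' end of upper strand
        "bottom_overhang": bottom[paired:],  # unpaired 3' end of lower strand
    }


linker = describe_linker("GCTCGAGCGGCC", "CCGGCGAGCTCG", paired=8)
print(linker)
```

Under these assumptions the duplex carries an internal XhoI site and a four-base GGCC extension at the 3' end of each strand, which is consistent with insertion into a site cut by ApaI.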

2. Generation of ES Cells

ES cells were derived from two sources. The first source
was isolation directly from C57BL/6 blastocysts (Evans, et al.
(1981), Nature, 292, 154-156) except that primary embryonic
fibroblasts (Doetschman, et al. (1985), J. Embryol. Exp.
Morphol., 87, 27-45) were used as feeders rather than STO
cells. Briefly, 2.5 days postpregnancy mice were ovariectomized,
and delayed blastocysts were recovered 4-6 days
later. The blastocysts were cultured on mitomycin C-inactivated
primary embryonic fibroblasts. After blastocyst
attachment and the outgrowth of the trophectoderm, the
ICM-derived clump was picked and dispersed by trypsin
into clumps of 3-4 cells and put onto new feeders. All
culturing was carried out in DMEM plus 20% FCS and
lO^M β-mercaptoethanol. The cultures were examined
daily. After 6-7 days in culture, colonies that still resembled
ES cells were picked, dispersed into single cells, and
replated on feeders. Those cell lines that retained the
morphology and growth characteristic of ES cells were tested for
pluripotency in vitro. These cell lines were maintained on
feeders and transferred every 2-3 days.

The second method was to utilize one of a number of ES 
cell lines isolated from other laboratories, e.g., CC1.2 
described by Kuehn, et al. (1987), Nature, 326, 295. The 
cells were grown on mitomycin C-inactivated STO cells. 
Cells from both sources behaved identically in gene targeting
experiments.

3. Introduction of PNS Vector pINT-2-N/TK into ES cells

The PNS vector pINT-2-N/TK was introduced into ES
cells by electroporation using the Promega Biotech X-Cell
2000. Rapidly growing cells were trypsinized, washed in
DMEM, counted and resuspended in buffer containing 20
mM HEPES (pH 7.0), 137 mM NaCl, 5 mM KCl, 0.7 mM
Na2HPO4, 6 mM dextrose, and 0.1 mM β-mercaptoethanol.
Just prior to electroporation, the linearized recombinant
vector was added. Approximately 25 ng of linearized PNS
vector was mixed with 10'' ES cells in each 1 ml-cuvette.

Cells and DNA were exposed to two sequential 625 V/cm
pulses at room temperature, allowed to remain in the buffer
for 10 minutes, then plated in non-selective media onto
feeder cells.

4. Selection of ES Cells Containing a Targeted Disruption
of the int-2 Locus

Following two days of non-selective growth, the cells
were trypsinized and replated onto G418 (250 µg/ml) media.
The positive-selection was applied alone for three days, at
which time the cells were again trypsinized and replated in
the presence of G418 and either gancyclovir (2xl0"*M)
(Syntex, Palo Alto, Calif.) or
1-(2-deoxy-2-fluoro-β-D-arabinofuranosyl)-5-iodouracil (F.I.A.U.) (lxlO~*M) (Bristol
Myers). When the cells had grown to confluency, each plate
of cells was divided into two aliquots, one of which was
frozen in liquid N2, the other harvested for DNA analysis.

5. Formation of INT-2 disrupted transgenic mice

Those transformed cells determined to be appropriately
modified by the PNS vector were grown in non-selective
media for 2-5 days prior to injection into blastocysts according
to the method of Bradley in Teratocarcinomas and
embryonic stem cells, a practical approach, edited by E. J.
Robertson, IRL Press, Oxford (1987), p. 125.

Blastocysts containing the targeted ES cells were 
implanted into pseudo-pregnant females and allowed to 
develop to term. Chimaeric offspring were identified by 
coat-color markers and those males showing chimaerism 
were selected for breeding offspring. Those offspring which
carry the mutant allele can be identified by coat color, and
the presence of the mutant allele reaffirmed by tail-blot
DNA analysis.

EXAMPLE 2 

Disruption at the hox1.4 locus in mouse ES cells

Disruption of the hox1.4 locus was performed by methods
similar to those described to disrupt the int-2 locus. There
were two major differences between these two disruption
strategies. First, the PNS vector, pHOX1.4N/TK-TK2 (FIG.
6), used to disrupt the hox1.4 locus contained two negative
selection markers, i.e., a DNA sequence 5 encoding a second
negative selection marker was included on the PNS vector at
the end opposite to DNA sequence 4 encoding the first
negative selection marker. DNA sequence 5 contained the tk
gene isolated from HSV-type 2. It functioned as a
negative-selectable marker by the same method as the original
HSV-tk gene, but the two tk genes are 20% non-homologous.
This non-homology further inhibits recombination
between DNA sequences 4 and 5 in the vector which might
have inhibited gene-targeting. The second difference
between the int-2 and the hox1.4 disruption strategies is that
the vector pHOX1.4N/TK-TK2 contains a deletion of 1000
bp of hox1.4 sequences internal to the gene, i.e., DNA
sequences 1 and 2 are not contiguous.

The HSV-tk2 sequences used in this construction were
obtained from pDG504 (Swain, M. A. et al. (1983), J. Virol.,
46, 1045). The structural TK gene from pDG504 was
inserted adjacent to the same promoter/enhancer sequences
used to express both the Neo and HSV-tk genes, to generate
the plasmid pIC20H/TK2.

Construction of pHOX1.4N/TK-TK2 proceeded in five
sequential steps as depicted in FIG. 6. First, a clone containing
hox1.4 sequences was isolated from a genomic λ library.
The λ library was constructed by inserting EcoRI partially
digested mouse DNA into the λ-DASH® (Stratagene) cloning
phage. The hox1.4 containing phage were identified by
virtue of their homology to a synthetic oligonucleotide
synthesized from the published sequence of the hox1.4 locus.
Tournier-Lasserve, et al. (1989), Mol. Cell Biol., 9, 2273.
Second, a 9 kb SalI-SpeI fragment containing the hox1.4
homeodomain was inserted into Bluescribe®. Third, a 1 kb
BglII fragment within the hox1.4 locus was replaced with the
Neo^r gene isolated from pMC1Neo, creating the plasmid
pHOX1.4N. Fourth, the XhoI-SalI fragment containing HSV-tk from
pIC19R/TK was inserted into the SalI site of pHOX1.4N,
generating the plasmid pHOX1.4N/TK. Fifth, the SalI-SpeI
fragment from pHOX1.4N/TK was inserted into a SalI-XbaI
digest of the plasmid pIC20H/TK2, generating the final
product, pHOX1.4N/TK-TK2. This vector was digested
with SalI to form a linear PNS vector which was transfected
into mouse ES cells as described in Example 1.
Positive-negative selection and the method of forming transgenic
mice were also as described in Example 1. Southern blots of
somatic cells demonstrate that the disrupted hox1.4 gene was
transferred to transgenic offspring.

EXAMPLE 3

Inactivation of Other Hox Genes

The methods described in Examples 1 and 2 have also
been used to disrupt the hox1.3, hox1.6, hox2.3, and int-1 loci
in ES cells. The genomic sequences for each of these loci
(isolated from the same λ-DASH library containing the hox1.4
clone) were used to construct PNS vectors to target disruption
of these genes. All of these PNS vectors contain the
Neo gene from pMC1Neo as the positive selection marker
and the HSV-tk and HSV-tk2 sequences as negative selection
markers.

TABLE V

hox1.3   11 kb Xba-HindIII   Tournier-Lasserve, et al. (1989), MCB, 9, 2273   EcoRI-site in
hox1.6   13 kb partial RI    Baron, et al. (1987), EMBO, 6, 2977              BglII-site in
hox2.3   12 kb BamHI
int-1    13 kb BglII
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EXAMPLE 4 
Vascular Graft Supplementing Factor VIII

In this example, a functional factor VIII gene is targeted
by a PNS vector to the β-actin locus in human endothelial
cells. When so incorporated, the expression of factor VIII is
controlled by the β-actin promoter, a promoter known to
function in nearly all somatic cells, including fibroblasts,
epithelial and endothelial cells. PNS vector construction is
as follows: In step 1A (FIG. 7A), the 13.8 kb EcoRI fragment
containing the entire human β-actin gene from the λ-phage,
14TB (Leavitte, et al. (1984), Mol. Cell Biol., 4, 1961) is
inserted, using synthetic EcoRI/XhoI adaptors, into the XhoI
site of the TK vector, pIC-19-R/TK, to form plasmid pBact/TK.
See FIG. 7A.

In step 1B (FIG. 7B), the 7.2 kb SalI fragment from a
factor VIII cDNA clone including its native signal sequence
(Kaufman, et al. (1988), JBC, 263, 6352; Toole, et al.
(1986), Proc. Natl. Acad. Sci. U.S.A., 83, 5939) is inserted
next to the Neo^r gene in a pMC1 derivative plasmid. This
places the neo^r gene (containing its own promoter/enhancer)
3' to the polyadenylation site of factor VIII. This plasmid is
designated pFVIII/Neo.

In step 2 (FIG. 7C), the factor VIII/Neo fragment is
excised with XhoI as a single piece and inserted using
synthetic XhoI/NcoI adaptors at the NcoI site encompassing
the met-initiation codon in pBact/TK. This codon lies in the
2nd exon of the β-actin gene, well away from the promoter,
such that transcription and splicing of the mRNA is in the
normal fashion. The vector so formed is designated
pBact/FVIII/Neo/TK.

This vector is digested with either ClaI or HindIII, which
acts in the polylinker adjacent to the TK gene. The linearized
vector is then introduced by electroporation into endothelial
cells isolated from a hemophiliac patient. The cells are then
selected for G418 and gancyclovir resistance. Those cells
shown by DNA analysis to contain the factor VIII gene
targeted to the β-actin locus, or cells shown to express FVIII,
are then seeded into a vascular graft which is subsequently
implanted into the patient's vascular system.

EXAMPLE 5
Replacement of a mutant PNP gene in human bone
marrow stem cells using PNS

The genomic clone of a normal purine nucleoside phosphorylase
(PNP) gene, available as a 12.4 kb, Xba-partial
fragment (Williams, et al. (1984), Nucl. Acids Res., 12, 5779;
Williams, et al. (1987), J. Biol. Chem., 262, 2332) is inserted
at the XbaI site in the vector, pIC-19-R/TK. The neo^r gene
from pMC1-Neo is inserted, using synthetic BamHI/XhoI
linkers, into the BamHI site in intron 1 of the PNP gene. The
linearized version of this vector (cut with ClaI) is illustrated
in FIG. 8.

Bone marrow stem cells from PNP patients transfected
with this vector are selected for neo^r, gan^r, in culture, and
those cells exhibiting replacement of the mutant gene with
the vector gene are transplanted into the patient.

EXAMPLE 6 
Inactivation by insertional mutagenesis of the Hox 
1.1 locus in mouse ES cells, using a promoterless 
PNS vector 

A promoterless positive selection marker is obtained
using the Neo^r gene, excised at its 5' end by enzyme, EcoRI,
from the plasmid, pMC1-Neo. Such a digestion removes the
Neo structural gene from its controlling elements.

A promoterless PNS vector is used to insert the Neo gene
into the Hox 1.1 gene in ES cells. The Hox 1.1 gene is
expressed in cultured embryo cells (Colberg-Poley, et al.
(1985), Nature, 314, 713) and the site of insertion, the
second exon, lies 3' to the promoter of the gene (Kessel, et
al. (1987), PNAS, 84, 5306; Zimmer, et al. (1989), Nature,
338, 150). Expression of Neo will thus be dependent upon
insertion at the Hox 1.1 locus.
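
The dependence of Neo expression on the insertion site can be made concrete with a small sketch. It is a deliberately simplified model: it assumes that a promoterless Neo gene is transcribed only when it lands downstream of an active host promoter, and that a random insertion lands in a silent region. The function name and the site labels are hypothetical.

```python
def neo_is_expressed(host_promoter_upstream: bool) -> bool:
    """Promoterless Neo carries no control elements of its own, so in this
    idealized model it is expressed only when an active host promoter lies
    upstream of the insertion site."""
    return host_promoter_upstream


# Assumed insertion sites; True means an active promoter lies 5' of the site.
insertion_sites = {
    "Hox 1.1 second exon (targeted)": True,
    "untranscribed region (random, assumed silent)": False,
}

g418_resistant = [
    site for site, active in insertion_sites.items() if neo_is_expressed(active)
]
print(g418_resistant)
```

A real random insertion may occasionally fall near another active promoter, which is one reason the surviving cells would still need to be checked.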

Vector construction is as follows: 

Step 1 - The neo gene, missing the transcriptional control
sequences, is removed from pMC1-Neo and inserted into the
second exon of the 11 kb, FspI-KpnI fragment of Hox 1.1
(Kessel, et al. (1987), supra; Zimmer, et al. (1989), supra).

Step 2 - The Hox 1.1-Neo sequence is then inserted
adjacent to the HSV-tk gene in pIC19R/TK, creating the
targeting vector, pHox1.1-N/TK. The linearized version of
this vector is shown in FIG. 9. This vector is electroporated
into ES cells, which are then selected for Neo^r, GanC^r. The
majority of cells surviving this selection are predicted to
contain targeted insertions of Neo at the Hox 1.1 locus.

EXAMPLE 7 

Inducible promoters

PNS vectors are used to insert novel control elements, for
example inducible promoters, into specific genetic loci. This
permits the induction of specified proteins under the spatial
and/or temporal control of the investigator. In this example,
the MT-I promoter is inserted by PNS into the Int-2 gene in
mouse ES cells.

The inducible promoter from the mouse metallothionein-I
(MT-I) locus is targeted to the Int-2 locus. Mice generated
from ES cells containing this alteration have an Int-2 gene
inducible by the presence of heavy metals. The expression of
this gene in mammary cells is predicted to result in oncogenesis
and provides an opportunity to observe the induction
of the disease.

Vector construction is as follows: 

Step 1 - The EcoRI-BglII fragment from the MT-I gene
(Palmiter, et al. (1982), Cell, 29, 701) is inserted by blunt-end
ligation into the BssHII site, 5' to the Int-2 structural
gene in the plasmid, pAT 153 (see discussion of Example 1).

Step 2 - The MC1-Neo gene is inserted into the AvrII site
in intron 2 of the Int-2-MT-I construct.

Step 3 - The Int-2-MT-I-Neo fragment is inserted into
the vector, pIC19R/TK, resulting in the construct shown in
FIG. 10.

Introduction of this gene into mouse ES cells by electroporation,
followed by Neo^r, GanC^r selection, results in
cells containing the MT-I promoter inserted 5' to the Int-2
gene. These cells are then inserted into mouse blastocysts to
generate mice carrying this particular allele.

EXAMPLE 8

Inactivation of the ALS-II gene in tobacco 
protoplasts by PNS 

A number of herbicides function by targeting specific
plant metabolic enzymes. Mutant alleles of the genes encoding
these enzymes have been identified which confer resistance
to specific herbicides. Protoplasts containing these
mutant alleles have been isolated in culture and grown to
mature plants which retain the resistant phenotype (Botterman,
et al. (1988), TIGS, 4, 219; Gasser, et al. (1989),
Science, 244, 1293). One problem with this technology is
that the enzymes involved are often active in multimer form,
and are coded by more than one genetic locus. Thus, plants
containing a normal (sensitive) allele at one locus and a
resistant allele at another locus produce enzymes with mixed
subunits which show unpredictable resistance characteristics.

In this example, the gene product of the ALS genes
(acetolactate synthase) is the target for both sulfonylurea and
imidazolinone herbicides (Lee, et al. (1988), EMBO, 7,
1241). Protoplasts resistant to these herbicides have been
isolated and shown to contain mutations in one of the two
ALS loci. A 10 kb SpeI fragment of the ALS-II gene (Lee,
et al. (1988), supra; Mazur, et al. (1987), Plant Phys., 85,
1110) is subcloned into the negative selection vector,
pIC-19R/TK. A neo^r gene, engineered for expression in plant
cells with regulating sequences from the mannopine synthase
gene of the Ti plasmid, is inserted into the EcoRI site
in the coding region of the ALS-II gene. This PNS vector is
transferred to the C3 tobacco cell line (Chalef, et al. (1984),
Science, 223, 1148), carrying a chlorsulfuron^r allele in ALS-I.

They are then selected for Neo^r, GanC^r. Those cells
surviving selection are screened by DNA blots for candidates
containing insertions in the ALS-II gene.

Having described the preferred embodiments of the 
present invention, it will appear to those ordinarily skilled in 
the art that various modifications may be made to the 
disclosed embodiments, and that such modifications are 
intended to be within the scope of the present invention. 

What is claimed is:

1. A positive-negative selection (PNS) vector for modi- 
fying a target DNA sequence contained in the genomes of 
murine embryonic stem cells, said PNS vector comprising: 

a first homologous vector DNA sequence capable of 

homologous recombination with a first region of said 

target DNA sequence, 
a positive selection marker DNA sequence capable of 

conferring a positive selection characteristic in said 

cells, 

a second homologous vector DNA sequence capable of 
homologous recombination with a second region of 
said target DNA sequence, and 

a negative selection marker DNA sequence, capable of
conferring a negative selection characteristic in said
cells, thereby allowing killing of said cells, but substantially
incapable of homologous recombination with
said target DNA sequence,

wherein the spatial order of said sequences in said PNS
vector is: said first homologous vector DNA sequence,
said positive selection marker DNA sequence, said
second homologous vector DNA sequence and said
negative selection marker DNA sequence as shown in
FIG. 1,

wherein the 5'-3' orientation of said first homologous
vector sequence relative to said second homologous
vector sequence is the same as the 5'-3' orientation of
said first region relative to said second region of said
target sequence;

wherein the vector is capable of modifying said target
DNA sequence by homologous recombination of said
first homologous vector DNA sequence with said first
region of said target sequence and of said second
homologous vector DNA sequence with said second
region of said target sequence.

2. The PNS vector of claim 1 wherein said target DNA
contains exons and introns and said positive selection
marker DNA sequence further contains the exon-intron and
intron-exon splicing sequences for an intron in said target
DNA.

3. The PNS vector of claim 2 wherein said first or said
second homologous vector DNA sequence contains at least
a portion of an exon wherein one or more nucleotides have
been substituted, deleted or inserted.

4. The PNS vector of claim 1 wherein said target DNA
sequence contains exons and introns and said first and
second homologous vector DNA sequences contain different
portions of the same exon of said target DNA sequence.

5. The PNS vector of claim 1 wherein said PNS vector has 
a length between 20 kb and 50 kb. 

6. The PNS vector of claim 1 wherein said first and said
second homologous vector DNA sequences have a length
between 25 base pairs and 50,000 base pairs each.

7. The PNS vector of claim 1 wherein said first and said
second homologous vector DNA sequences have a length
between 1,000 base pairs and 15,000 base pairs each.

8. The PNS vector of claim 1 wherein said positive
selection marker DNA sequence is selected from the group
consisting of DNA sequences encoding neomycin resistance,
hygromycin resistance, histidinol resistance, xanthine
utilization and bleomycin resistance.

9. The PNS vector of claim 8 wherein said positive
selection marker is a DNA sequence encoding neomycin
resistance.


10. The PNS vector of claim 1 wherein said negative
selection marker DNA sequence is selected from the group
consisting of DNA sequences encoding Hprt, gpt, HSV-tk,
diphtheria toxin, ricin toxin and cytosine deaminase.

11. The PNS vector of claim 1 wherein said negative 
selection marker is a DNA sequence encoding HSV-tk. 

12. The PNS vector of claim 1 wherein said first or said
second homologous vector DNA sequence comprises a DNA
sequence having a modification of said target DNA
sequence.


13. The PNS vector of claim 12 wherein said modification
is an insertion of one or more nucleotides.

14. The PNS vector of claim 1 wherein said first or said 
second homologous vector DNA sequence encodes the 
correction of a genetic defect in said target DNA sequence. 

15. The PNS vector of claim 14 wherein said genetic
defect in said target DNA comprises the insertion of one or
more nucleotides in said target DNA sequence.

16. The PNS vector of claim 15 wherein said genetic
defect is associated with hemoglobinopathies, deficiencies
in circulatory factors, intracellular enzymes or extracellular
enzymes.

17. The PNS vector of claim 1 wherein said PNS vector 
is linear. 

18. The PNS vector of claim 1 wherein said PNS vector 

is closed circular. 

19. A method for enriching for a transformed murine
embryonic stem cell containing a modification in a target
DNA sequence in the genome of said cell comprising:

(a) transfecting cells capable of mediating homologous
recombination with a positive-negative selection vector
comprising:

a first homologous vector DNA sequence capable of
homologous recombination with a first region of said
target DNA sequence,

a positive selection marker DNA sequence capable of
conferring a positive selection characteristic in said
cells,

a second homologous vector DNA sequence capable of
homologous recombination with a second region of
said target DNA sequence, and

a negative selection marker DNA sequence, capable of 
conferring a negative selection characteristic in said 
cells, thereby allowing killing of said cells but sub- 
stantially incapable of homologous recombination 
with said target DNA sequence, 

wherein the spatial order of said sequences in said PNS 
vector is: said first homologous vector DNA 
sequence, said positive selection marker DNA 
sequence, said second homologous vector DNA 
sequence and said negative selection marker DNA 
sequence as shown in FIG. 1, 

wherein the 5'-3' orientation of said first homologous
vector sequence relative to said second homologous
vector sequence is the same as the 5'-3' orientation of
said first region relative to said second region of said
target sequence;

wherein the vector is capable of modifying the target 
DNA sequence by homologous recombination of 
said first and second homologous vector sequences 
with the first and second regions of said target 
sequence; 

(b) selecting for transformed cells in which said positive- 
negative selection vector has integrated into said target 
DNA sequence by homologous recombination by 
sequentially or simultaneously selecting against trans- 
formed cells containing said negative selection marker 
and selecting for cells containing said positive selection 
marker; and 

(c) analyzing the DNA of transformed cells surviving the 
selecting step to identify a cell containing the modifi- 
cation. 

20. The method of claim 19 wherein said target DNA 
contains exons and introns and said positive selection 
marker DNA sequence further contains the exon-intron and 
intron-exon splice sequences for an intron in said target 
DNA sequence. 

21. The method of claim 19 wherein said first or said 
second homologous vector DNA sequence contains at least 
a portion of an exon of said target DNA sequence wherein 
one or more nucleotides of said target sequence have been 
substituted, deleted or inserted. 

22. The method of claim 19 wherein said target DNA 
sequence contains exons and introns and said first and 
second homologous vector DNA sequences contain different 
portions of the same exon of said target DNA sequence. 

23. The method of claim 19 wherein said PNS vector has 
a length between 20 kb and 50 kb. 

24. The method of claim 19 wherein said first and said 
second homologous vector DNA sequences have a length 
between 1,000 base pairs and 15,000 base pairs each. 

25. The method of claim 19 wherein said positive selection
marker DNA sequence is selected from the group
consisting of DNA sequences encoding neomycin resistance,
hygromycin resistance, histidinol resistance, xanthine
utilization and bleomycin resistance.

26. The method of claim 25 wherein said positive selection
marker is a DNA sequence encoding neomycin resistance.

27. The method of claim 19 wherein said negative selec- 
tion marker DNA sequence is selected from the group 
consisting of DNA sequences encoding Hprt, gpt, HSV-tk, 
diphtheria toxin, ricin toxin or cytosine deaminase. 

28. The method of claim 27 wherein said negative selec- 
tion marker is a DNA sequence encoding HSV-tk. 

29. The method of claim 19 wherein said first or said
second homologous vector DNA sequence in said PNS
vector further comprises a DNA sequence having a modification
of said target DNA sequence.

30. The method of claim 29 wherein said modification is
a substitution, insertion or deletion of one or more nucleotides.

31. The PNS method of claim 19 wherein said first or said
second homologous vector DNA sequences in said PNS
vector encodes the correction of a genetic defect in said
target DNA.

32. The method of claim 31 wherein said genetic defect in 
said target DNA comprises the insertion of one or more 
nucleotides in said target DNA sequence. 

33. The method of claim 32 wherein said genetic defect is
associated with hemoglobinopathies, deficiencies in circulatory
factors, extracellular enzymes or intracellular
enzymes.

34. The method of claim 19 wherein said PNS vector is
linear.


35. The method of claim 19 wherein said PNS vector is
closed circular.

36. The PNS vector of claim 1, wherein said target DNA
sequence is a gene.

37. The PNS vector of claim 1, wherein said target DNA
sequence is a regulatory sequence.

38. The PNS vector of claim 12, wherein said modification
is a deletion of one or more nucleotides in said target
DNA sequence.

39. The PNS vector of claim 14, wherein said genetic 
defect in said target DNA comprises the deletion of one or 
more nucleotides in said target DNA sequence. 

40. The PNS vector of claim 12, wherein said modifica- 
tion is a substitution of one or more nucleotides in said target 
DNA sequence. 

41. The PNS vector of claim 14, wherein said genetic 
defect in said target DNA comprises the substitution of one 
or more nucleotides in said target DNA sequence. 

42. The method of claim 31, wherein said genetic defect 
in said target DNA comprises the deletion of one or more 
nucleotides in said target DNA sequence. 

43. The method of claim 31, wherein said genetic defect
in said target DNA comprises the substitution of one or more
nucleotides in said target DNA sequence.

44. The method of claim 19 wherein said vector is a
sequence replacement vector and said first and second
homologous vector DNA sequences comprise c
first and second regions in said target DNA sc
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ABSTRACT 



Recombinational cloning is provided by the use of nucleic 
acids, vectors and methods, in vitro and in vivo, for moving 
or exchanging segments of DNA molecules using engi- 
neered recombination sites and recombination proteins to 
provide chimeric DNA molecules that have the desired 
characteristic(s) and/or DNA segment(s).

RECOMBINATIONAL CLONING USING
ENGINEERED RECOMBINATION SITES

CROSS-REFERENCE TO RELATED
APPLICATIONS

The present application is a continuation-in-part of U.S.
application Ser. No. 08/486,139, filed Jun. 7, 1995, now
abandoned, which application is entirely incorporated herein
by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to recombinant DNA technology.
DNA and vectors having engineered recombination
sites are provided for use in a recombinational cloning
method that enables efficient and specific recombination of
DNA segments using recombination proteins. The DNAs,
vectors and methods are useful for a variety of DNA
exchanges, such as subcloning of DNA, in vitro or in vivo.

2. Related Art

Site specific recombinases. Site specific recombinases are
enzymes that are present in some viruses and bacteria and
have been characterized to have both endonuclease and
ligase properties. These recombinases (along with associated
proteins in some cases) recognize specific sequences of
bases in DNA and exchange the DNA segments flanking
those segments. The recombinases and associated proteins
are collectively referred to as "recombination proteins" (see,
e.g., Landy, A., Current Opinion in Biotechnology
3:699-707 (1993)).

Numerous recombination systems from various organisms
have been described. See, e.g., Hoess et al., Nucleic
Acids Research 14(6):2287 (1986); Abremski et al., J. Biol.
Chem. 261(1):391 (1986); Campbell, J. Bacteriol. 174(23)
:7495 (1992); Qian et al., J. Biol. Chem. 267(11):7794
(1992); Araki et al., J. Mol. Biol. 225(1):25 (1992); Maeser
and Kahnmann (1991) Mol. Gen. Genet. 230:170-176).

Many of these belong to the integrase family of recombinases
(Argos et al. EMBO J. 5:433-440 (1986)). Perhaps
the best studied of these are the Integrase/att system from
bacteriophage λ (Landy, A. Current Opinions in Genetics
and Devel. 3:699-707 (1993)), the Cre/loxP system from
bacteriophage P1 (Hoess and Abremski (1990) In Nucleic
Acids and Molecular Biology, vol. 4. Eds.: Eckstein and
Lilley, Berlin-Heidelberg: Springer-Verlag; pp. 90-109),
and the FLP/FRT system from the Saccharomyces cerevisiae
2 μ circle plasmid (Broach et al. Cell 29:227-234 (1982)).

While these recombination systems have been characterized
for particular organisms, the related art has only taught
using recombinant DNA flanked by recombination sites, for
in vivo recombination.

Backman (U.S. Pat. No. 4,673,640) discloses the in vivo
use of λ recombinase to recombine a protein producing
DNA segment by enzymatic site-specific recombination
using wild-type recombination sites attB and attP.

Hasan and Szybalski (Gene 56:145-151 (1987)) discloses
the use of λ Int recombinase in vivo for intramolecular
recombination between wild type attP and attB sites which
flank a promoter. Because the orientations of these sites are
inverted relative to each other, this causes an irreversible
flipping of the promoter region relative to the gene of
interest.

Palazzolo et al. Gene 88:25-36 (1990), discloses phage
lambda vectors having bacteriophage λ arms that contain
restriction sites positioned outside a cloned DNA sequence
and between wild-type loxP sites. Infection of E. coli cells
that express the Cre recombinase with these phage vectors
results in recombination between the loxP sites and the in
vivo excision of the plasmid replicon, including the cloned
cDNA.

Pósfai et al. (Nucl. Acids Res. 22:2392-2398 (1994))
discloses a method for inserting into genomic DNA partial
expression vectors having a selectable marker, flanked by
two wild-type FRT recognition sequences. FLP site-specific
recombinase as present in the cells is used to integrate the
vectors into the genome at predetermined sites. Under
conditions where the replicon is functional, this cloned
genomic DNA can be amplified.

Bebee et al. (U.S. Pat. No. 5,434,066) discloses the use of
site-specific recombinases such as Cre for DNA containing
two loxP sites, used for in vivo recombination between the
sites.

Boyd (Nucl. Acids Res. 21:817-821 (1993)) discloses a
method to facilitate the cloning of blunt-ended DNA using
conditions that encourage intermolecular ligation to a
dephosphorylated vector that contains a wild-type loxP site
acted upon by a Cre site-specific recombinase present in E.
coli host cells.

Waterhouse et al. (PCT No. 93/19172 and Nucleic Acids
Res. 21 (9):2265 (1993)) disclose an in vivo method where
light and heavy chains of a particular antibody were cloned
in different phage vectors between loxP and loxP 511 sites
and used to transfect new E. coli cells. Cre, acting in the host
cells on the two parental molecules (one plasmid, one
phage), produced four products in equilibrium: two different
cointegrates (produced by recombination at either loxP or
loxP 511 sites), and two daughter molecules, one of which
was the desired product.

In contrast to the other related art, Schlake & Bode
(Biochemistry 33:12746-12751 (1994)) discloses an in vivo
method to exchange expression cassettes at defined chromosomal
locations, each flanked by a wild type and a
spacer-mutated FRT recombination site. A double-reciprocal
crossover was mediated in cultured mammalian cells by
using this FLP/FRT system for site-specific recombination.

Transposases. The family of enzymes, the transposases,
has also been used to transfer genetic information between
replicons. Transposons are structurally variable, being
described as simple or compound, but typically encode the
recombinase gene flanked by DNA sequences organized in
inverted orientations. Integration of transposons can be
random or highly specific. Representatives such as Tn7,
which are highly site-specific, have been applied to the in
vivo movement of DNA segments between replicons
(Lucklow et al., J. Virol. 67:4566-4579 (1993)).

Devine and Boeke Nucl. Acids Res. 22:3765-3772
(1994), discloses the construction of artificial transposons
for the insertion of DNA segments, in vitro, into recipient
DNA molecules. The system makes use of the integrase of
yeast TY1 virus-like particles. The DNA segment of interest
is cloned, using standard methods, between the ends of the
transposon-like element TY1. In the presence of the TY1
integrase, the resulting element integrates randomly into a
second target DNA molecule.

DNA cloning. The cloning of DNA segments currently
occurs as a daily routine in many research labs and as a
prerequisite step in many genetic analyses. The purpose of
these clonings is various, however, two general purposes can
be considered: (1) the initial cloning of DNA from large
DNA or RNA segments (chromosomes, YACs, PCR
fragments, mRNA, etc.), done in a relative handful of known
vectors such as pUC, pGem, pBlueScript, and (2) the
subcloning of these DNA segments into specialized vectors
for functional analysis. A great deal of time and effort is
expended both in the initial cloning of DNA segments and
in the transfer of DNA segments from the initial cloning
vectors to the more specialized vectors. This transfer is
called subcloning.

The basic methods for cloning have been known for many
years and have changed little during that time. A typical
cloning protocol is as follows:

(1) digest the DNA of interest with one or two restriction 
enzymes; 

(2) gel purify the DNA segment of interest when known; 

(3) prepare the vector by cutting with appropriate restriction
enzymes, treating with alkaline phosphatase, gel
purifying, etc., as appropriate;

(4) ligate the DNA segment to vector, with appropriate
controls to estimate background of uncut and self-ligated
vector;

(5) introduce the resulting vector into an E. coli host cell; 

(6) pick selected colonies and grow small cultures over- 
night; 

(7) make DNA minipreps; and 

(8) analyze the isolated plasmid on agarose gels (often
after diagnostic restriction enzyme digestions) or by
PCR.

The specialized vectors used for subcloning DNA segments
are functionally diverse. These include but are not
limited to: vectors for expressing genes in various organisms;
for regulating gene expression; for providing tags to
aid in protein purification or to allow tracking of proteins in
cells; for modifying the cloned DNA segment (e.g., generating
deletions); for the synthesis of probes (e.g.,
riboprobes); for the preparation of templates for DNA
sequencing; for the identification of protein coding regions;
for the fusion of various protein-coding regions; to provide
large amounts of the DNA of interest, etc. It is common that
a particular investigation will involve subcloning the DNA
segment of interest into several different specialized vectors.

As known in the art, simple subclonings can be done in
one day (e.g., the DNA segment is not large and the
restriction sites are compatible with those of the subcloning
vector). However, many other subclonings can take several
weeks, especially those involving unknown sequences, long
fragments, toxic genes, unsuitable placement of restriction
sites, high backgrounds, impure enzymes, etc. Subcloning
DNA fragments is thus often viewed as a chore to be done
as few times as possible.

Several methods for facilitating the cloning of DNA
segments have been described, e.g., as in the following
references.

Ferguson, J., et al. Gene 16:191 (1981), discloses a family
of vectors for subcloning fragments of yeast DNA. The
vectors encode kanamycin resistance. Clones of longer yeast
DNA segments can be partially digested and ligated into the
subcloning vectors. If the original cloning vector conveys
resistance to ampicillin, no purification is necessary prior to
transformation, since the selection will be for kanamycin.

Hashimoto-Gotoh, T., et al. Gene 41:125 (1986), discloses
a subcloning vector with unique cloning sites within a
streptomycin sensitivity gene; in a streptomycin-resistant
host, only plasmids with inserts or deletions in the dominant
sensitivity gene will survive streptomycin selection.
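
The selection principle just described can be written out as a small truth-table sketch. It is an illustration under stated assumptions, not a description of the cited vector: the host is taken to be streptomycin-resistant, and an intact plasmid-borne sensitivity gene is assumed to be fully dominant. The function name and plasmid labels are hypothetical.

```python
def survives_streptomycin(host_is_resistant: bool,
                          sensitivity_gene_intact: bool) -> bool:
    """Dominant-sensitivity model: an intact plasmid-borne sensitivity gene
    overrides the host's resistance, so a resistant host survives only when
    that gene has been disrupted by an insert or a deletion."""
    return host_is_resistant and not sensitivity_gene_intact


# Hypothetical plasmids; the value records whether the sensitivity gene is intact.
plasmids = {
    "empty vector": True,
    "vector with insert": False,
    "vector with deletion": False,
}

survivors = [
    name for name, intact in plasmids.items()
    if survives_streptomycin(host_is_resistant=True, sensitivity_gene_intact=intact)
]
print(survivors)
```

The sketch also shows the limitation of such a scheme: a deletion survives selection just as an insert does, so surviving colonies still have to be screened.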

Accordingly, traditional subcloning methods, using
restriction enzymes and ligase, are time consuming and
relatively unreliable. Considerable labor is expended, and if
two or more days later the desired subclone can not be found
among the candidate plasmids, the entire process must then
be repeated with alternative conditions attempted. Although
site specific recombinases have been used to recombine
DNA in vivo, the successful use of such enzymes in vitro
was expected to suffer from several problems. For example,
the site specificities and efficiencies were expected to differ
in vitro; topologically-linked products were expected; and
the topology of the DNA substrates and recombination
proteins was expected to differ significantly in vitro (see,
e.g., Adams et al., J. Mol. Biol. 226:661-73 (1992)). Reactions
that could go on for many hours in vivo were expected
to occur in significantly less time in vitro before the enzymes
became inactive. Multiple DNA recombination products
were expected in the biological host used, resulting in
unsatisfactory reliability, specificity or efficiency of subcloning.
In vitro recombination reactions were not expected to be
sufficiently efficient to yield the desired levels of product.

Accordingly, there is a long felt need to provide an 
alternative subcloning system that provides advantages over 
the known use of restriction enzymes and ligases. 
SUMMARY OF THE INVENTION 

The present invention provides nucleic acid, vectors and 
methods for obtaining chimeric nucleic acid using recom- 
bination proteins and engineered recombination sites, in 
vitro or in vivo. These methods are highly specific, rapid, 
and less labor intensive than what is disclosed or suggested 
in the related background art. The improved specificity, 
speed and yields of the present invention facilitate DNA or
RNA subcloning, regulation or exchange useful for any 
related purpose. Such purposes include in vitro recombina- 
tion of DNA segments and in vitro or in vivo insertion or 
modification of transcribed, replicated, isolated or genomic 
DNA or RNA. 

The present invention relates to nucleic acids, vectors and 
methods for moving or exchanging segments of DNA using 
at least one engineered recombination site and at least one 
recombination protein to provide chimeric DNA molecules 
which have the desired characteristic(s) and/or DNA 
segment(s). Generally, one or more parent DNA molecules 
are recombined to give one or more daughter molecules, at 
least one of which is the desired Product DNA segment or 
vector. The invention thus relates to DNA, RNA, vectors and 
methods to effect the exchange and/or to select for one or 
more desired products. 

One embodiment of the present invention relates to a 
method of making chimeric DNA, which comprises 

(a) combining in vitro or in vivo 

(i) an Insert Donor DNA molecule, comprising a desired 
DNA segment flanked by a first recombination site and 
a second recombination site, wherein the first and 
second recombination sites do not recombine with each
other;

(ii) a Vector Donor DNA molecule containing a third 
recombination site and a fourth recombination site, 
wherein the third and fourth recombination sites do not 
recombine with each other; and 

(iii) one or more site specific recombination proteins 
capable of recombining the first and third recombina-
tional sites and/or the second and fourth recombina-
tional sites;

thereby allowing recombination to occur, so as to produce 
at least one Cointegrate DNA molecule, at least one 
desired Product DNA molecule which comprises said 
desired DNA segment, and optionally a Byproduct 
DNA molecule; and then, optionally,

(b) selecting for the Product or Byproduct DNA molecule.

Another embodiment of the present invention relates to a 
kit comprising a carrier or receptacle being compartmental- 
ized to receive and hold therein at least one container, 
wherein a first container contains a DNA molecule compris-
ing a vector having at least two recombination sites flanking
a cloning site or a Selectable marker, as described herein. 
The kit optionally further comprises: 

(i) a second container containing a Vector Donor plasmid 
comprising a subcloning vector and/or a Selectable 
marker of which one or both are flanked by one or more 
engineered recombination sites; and/or 

(ii) a third container containing at least one recombination 
protein which recognizes and is capable of recombining 
at least one of said recombination sites. 

Other embodiments include DNA and vectors useful in 
the methods of the present invention. In particular, Vector 
Donor molecules are provided in one embodiment, wherein 
DNA segments within the Vector Donor are separated either 
by, (i) in a circular Vector Donor, at least two recombination 
sites, or (ii) in a linear Vector Donor, at least one recombi- 
nation site, where the recombination sites are preferably 
engineered to enhance specificity or efficiency of recombi-
nation.

One Vector Donor embodiment comprises a first DNA
segment and a second DNA segment, the first or second
segment comprising a Selectable marker. A second Vector 
Donor embodiment comprises a first DNA segment and a 
second DNA segment, the first or second DNA segment 
comprising a toxic gene. A third Vector Donor embodiment 
comprises a first DNA segment and a second DNA segment, 
the first or second DNA segment comprising an inactive 
fragment of at least one Selectable marker, wherein the 
inactive fragment of the Selectable marker is capable of 
reconstituting a functional Selectable marker when recom- 
bined across the first or second recombination site with 
another inactive fragment of at least one Selectable marker. 

The present recombinational cloning method possesses
several advantages over previous in vivo methods. Since 
single molecules of recombination products can be intro- 
duced into a biological host, propagation of the desired 
Product DNA in the absence of other DNA molecules (e.g., 
starting molecules, intermediates, and by-products) is more 
readily realized. Reaction conditions can be freely adjusted 
in vitro to optimize enzyme activities. DNA molecules that
are incompatible with the desired biological host (e.g., YACs,
genomic DNA, etc.) can be used. Recombination proteins
from diverse sources can be employed, together or sequen- 
tially. 

Other embodiments will be evident to those of ordinary 
skill in the art from the teachings contained herein in
combination with what is known to the art. 

BRIEF DESCRIPTION OF THE FIGURES 

FIG. 1 depicts one general method of the present 
invention, wherein the starting (parent) DNA molecules can
be circular or linear. The goal is to exchange the new
subcloning vector D for the original cloning vector B. It is 
desirable in one embodiment to select for AD and against all 
the other molecules, including the Cointegrate. The square 
and circle are sites of recombination: e.g., loxP sites, att 
sites, etc. For example, segment D can contain expression 
signals, new drug markers, new origins of replication, or 
specialized functions for mapping or sequencing DNA. 

FIG. 2A depicts an in vitro method of recombining an 
Insert Donor plasmid (here, pEZC705) with a Vector Donor
plasmid (here, pEZC726), and obtaining Product DNA and
Byproduct daughter molecules. The two recombination sites
are attP and loxP on the Vector Donor. On one segment
defined by these sites is a kanamycin resistance gene whose
promoter has been replaced by the tetOP operator/promoter
from transposon Tn10. See Sizemore et al., Nucl. Acids Res.
18(10):2875 (1990). In the absence of tet repressor protein,
E. coli RNA polymerase transcribes the kanamycin resis-
tance gene from the tetOP. If tet repressor is present, it binds
to tetOP and blocks transcription of the kanamycin resis-
tance gene. The other segment of pEZC726 has the tet
repressor gene expressed by a constitutive promoter. Thus
cells transformed by pEZC726 are resistant to
chloramphenicol, because of the chloramphenicol acetyl
transferase gene on the same segment as tetR, but are
sensitive to kanamycin. The recombinase-mediated reac-
tions result in separation of the tetR gene from the regulated
kanamycin resistance gene. This separation results in kana-
mycin resistance in cells receiving only the desired recom-
bination products. The first recombination reaction is driven
by the addition of the recombinase called Integrase. The
second recombination reaction is driven by adding the
recombinase Cre to the Cointegrate (here, pEZC7
Cointegrate).

FIG. 2B depicts a restriction map of pEZC705.

FIG. 2C depicts a restriction map of pEZC726.

FIG. 2D depicts a restriction map of pEZC7 Cointegrate.

FIG. 2E depicts a restriction map of Intprod.

FIG. 2F depicts a restriction map of Intbypro.

FIG. 3A depicts an in vitro method of recombining an
Insert Donor plasmid (here, pEZC602) with a Vector Donor
plasmid (here, pEZC629), and obtaining Product (here,
EZC6prod) and Byproduct (here, EZC6Bypr) daughter mol-
ecules. The two recombination sites are loxP and loxP 511.
One segment of pEZC629 defined by these sites is a kana-
mycin resistance gene whose promoter has been replaced by
the tetOP operator/promoter from transposon Tn10. In the
absence of tet repressor protein, E. coli RNA polymerase
transcribes the kanamycin resistance gene from the tetOP. If
tet repressor is present, it binds to tetOP and blocks tran-
scription of the kanamycin resistance gene. The other seg-
ment of pEZC629 has the tet repressor gene expressed by a
constitutive promoter. Thus cells transformed by pEZC629
are resistant to chloramphenicol, because of the chloram-
phenicol acetyl transferase gene on the same segment as
tetR, but are sensitive to kanamycin. The reactions result in
separation of the tetR gene from the regulated kanamycin
resistance gene. This separation results in kanamycin resis-
tance in cells receiving the desired recombination product.
The first and the second recombination events are driven by
the addition of the same recombinase, Cre.

FIG. 3B depicts a restriction map of EZC6Bypr. 

FIG. 3C depicts a restriction map of EZC6prod. 

FIG. 3D depicts a restriction map of pEZC602. 

FIG. 3E depicts a restriction map of pEZC629. 

FIG. 3F depicts a restriction map of EZC6coint. 

FIG. 4A depicts an application of the in vitro method of
recombinational cloning to subclone the chloramphenicol
acetyl transferase gene into a vector for expression in
eukaryotic cells. The Insert Donor plasmid, pEZC843, is
comprised of the chloramphenicol acetyl transferase gene of
E. coli, cloned between loxP and attB sites such that the loxP
site is positioned at the 5'-end of the gene. The Vector Donor
plasmid, pEZC1003, contains the cytomegalovirus eukary-
otic promoter apposed to a loxP site. The supercoiled
plasmids were combined with lambda Integrase and Cre
recombinase in vitro. After incubation, competent E. coli 
cells were transformed with the recombinational reaction 
solution. Aliquots of transformations were spread on agar 
plates containing kanamycin to select for the Product mol- 
ecule (here CMVProd). 
FIG. 4B depicts a restriction map of pEZC843. 
FIG. 4C depicts a restriction map of pEZC1003. 
FIG. 4D depicts a restriction map of CMVBypro. 
FIG. 4E depicts a restriction map of CMVProd. 
FIG. 4F depicts a restriction map of CMVcoint. 
FIG. 5A depicts a vector diagram of pEZC1301. 
FIG. 5B depicts a vector diagram of pEZC1305. 
FIG. 5C depicts a vector diagram of pEZC1309. 
FIG. 5D depicts a vector diagram of pEZC1313. 
FIG. 5E depicts a vector diagram of pEZC1317. 
FIG. 5F depicts a vector diagram of pEZC1321. 
FIG. 5G depicts a vector diagram of pEZC1405. 
FIG. 5H depicts a vector diagram of pEZC1502. 
FIG. 6A depicts a vector diagram of pEZC1603. 
FIG. 6B depicts a vector diagram of pEZC1706. 
FIG. 7A depicts a vector diagram of pEZC2901. 
FIG. 7B depicts a vector diagram of pEZC2913.
FIG. 7C depicts a vector diagram of pEZC3101. 
FIG. 7D depicts a vector diagram of pEZC1802. 
FIG. 8A depicts a vector diagram of pGEX-2TK. 
FIG. 8B depicts a vector diagram of pEZC3501. 
FIG. 8C depicts a vector diagram of pEZC3601. 
FIG. 8D depicts a vector diagram of pEZC3609. 
FIG. 8E depicts a vector diagram of pEZC3617. 
FIG. 8F depicts a vector diagram of pEZC3606. 
FIG. 8G depicts a vector diagram of pEZC3613. 
FIG. 8H depicts a vector diagram of pEZC3621. 
FIG. 8I depicts a vector diagram of GST-CAT.
FIG. 8J depicts a vector diagram of GST-phoA. 
FIG. 8K depicts a vector diagram of pEZC3201. 



It is unexpectedly discovered in the present invention that 
subcloning reactions can be provided using recombinational 
cloning. Recombination cloning according to the present 
invention uses DNAs, vectors and methods, in vitro and in 
vivo, for moving or exchanging segments of DNA mol- 
ecules using engineered recombination sites and recombi- 
nation proteins. These methods provide chimeric DNA mol- 
ecules that have the desired characteristic(s) and/or DNA 
segment(s). 

The present invention thus provides nucleic acid, vectors 
and methods for obtaining chimeric nucleic acid using 
recombination proteins and engineered recombination sites, 
in vitro or in vivo. These methods are highly specific, rapid, 
and less labor intensive than what is disclosed or suggested 
in the related background art. The improved specificity, 
speed and yields of the present invention facilitate DNA or
RNA subcloning, regulation or exchange useful for any 
related purpose. Such purposes include in vitro recombina- 
tion of DNA segments and in vitro or in vivo insertion or 
modification of transcribed, replicated, isolated or genomic 
DNA or RNA. 

Definitions 

In the description that follows, a number of terms used in 
recombinant DNA technology are utilized extensively. In
order to provide a clear and consistent understanding of the
specification and claims, including the scope to be given 
such terms, the following definitions are provided. 

Byproduct: is a daughter molecule (a new clone produced
after the second recombination event during the recombi-
national cloning process) lacking the DNA which is desired
to be subcloned.

Cointegrate: is at least one recombination intermediate
DNA molecule of the present invention that contains both
parental (starting) DNA molecules. It will usually be circu-
lar. In some embodiments it can be linear.

Host: is any prokaryotic or eukaryotic organism that can 
be a recipient of the recombinational cloning Product. A 
"host," as the term is used herein, includes prokaryotic or 
eukaryotic organisms that can be genetically engineered. For 
examples of such hosts, see Maniatis et al., Molecular 
Cloning. A Laboratory Manual, Cold Spring Harbor 
Laboratory, Cold Spring Harbor, N.Y. (1982). 

Insert: is the desired DNA segment (segment A of FIG. 1)
which one wishes to manipulate by the method of the present
invention. The insert can have one or more genes.

Insert Donor: is one of the two parental DNA molecules
of the present invention which carries the Insert. The Insert
Donor DNA molecule comprises the Insert flanked on both
sides with recombination signals. The Insert Donor can be
linear or circular. In one embodiment of the invention, the
Insert Donor is a circular DNA molecule and further com-
prises a cloning vector sequence outside of the recombina-
tion signals (see FIG. 1).

Product: is one or both the desired daughter molecules
comprising the A and D or B and C sequences which are
produced after the second recombination event during the
recombinational cloning process (see FIG. 1). The Product
contains the DNA which was to be cloned or subcloned.

Promoter: is a DNA sequence generally described as the
5'-region of a gene, located proximal to the start codon. The
transcription of an adjacent DNA segment is initiated at the 
promoter region. A repressible promoter's rate of transcrip- 
tion decreases in response to a repressing agent. An induc- 
ible promoter's rate of transcription increases in response to 
an inducing agent. A constitutive promoter's rate of tran- 
scription is not specifically regulated, though it can vary 
under the influence of general metabolic conditions. 

Recognition sequence: Recognition sequences are par-
ticular DNA sequences which a protein, DNA, or RNA
molecule (e.g., restriction endonuclease, a modification
methylase, or a recombinase) recognizes and binds. For
example, the recognition sequence for Cre recombinase is
loxP which is a 34 base pair sequence comprised of two 13
base pair inverted repeats (serving as the recombinase
binding sites) flanking an 8 base pair core sequence. See
FIG. 1 of Sauer, B., Current Opinion in Biotechnology
5:521-527 (1994). Other examples of recognition sequences
are the attB, attP, attL, and attR sequences which are
recognized by the recombinase enzyme λ Integrase. attB is
an approximately 25 base pair sequence containing two 9
base pair core-type Int binding sites and a 7 base pair overlap
region. attP is an approximately 240 base pair sequence
containing core-type Int binding sites and arm-type Int
binding sites as well as sites for auxiliary proteins IHF, FIS,
and Xis. See Landy, Current Opinion in Biotechnology
3:699-707 (1993). Such sites are also engineered according
to the present invention to enhance methods and products.
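
The loxP layout given in this definition (two 13 base pair inverted repeats flanking an 8 base pair core) can be checked with a short Python sketch. The sequence below is the commonly cited wild-type loxP sequence; it is not recited in this specification and is used here as an assumption, and the helper names are illustrative only.

```python
# Sketch: check the 13 + 8 + 13 layout of a loxP site.
# LOXP is the commonly cited wild-type sequence; it is an assumption here,
# not a sequence taken from this specification.
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_lox_site(site: str) -> tuple[str, str, str]:
    """Split a 34 bp lox-type site into (left arm, core, right arm)."""
    if len(site) != 34:
        raise ValueError("a lox-type site is expected to be 34 bp long")
    return site[:13], site[13:21], site[21:]


left_arm, core, right_arm = split_lox_site(LOXP)

# The two 13 bp arms are inverted repeats of one another.
arms_are_inverted_repeats = reverse_complement(left_arm) == right_arm

# The 8 bp core is not its own reverse complement, which gives the site
# a direction.
core_is_asymmetric = reverse_complement(core) != core
```

The asymmetry of the core is what allows two copies of a site to be described as directly or inversely oriented relative to each other.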

Recombinase: is an enzyme which catalyzes the exchange 
of DNA segments at specific recombination sites. 

Recombinational Cloning: is a method described herein,
whereby segments of DNA molecules are exchanged,
inserted, replaced, substituted or modified, in vitro or in
vivo.

Recombination proteins: include excisive or integrative
proteins, enzymes, co-factors or associated proteins that are 
involved in recombination reactions involving one or more 
recombination sites. See, Landy (1994), infra. 

Repression cassette: is a DNA segment that contains a 
repressor of a Selectable marker present in the subcloning
vector.

Selectable marker: is a DNA segment that allows one to
select for or against a molecule or a cell that contains it,
often under particular conditions. These markers can encode
an activity, such as, but not limited to, production of RNA,
peptide, or protein, or can provide a binding site for RNA,
peptides, proteins, inorganic and organic compounds or
compositions and the like. Examples of Selectable markers
include but are not limited to: (1) DNA segments that encode
products which provide resistance against otherwise toxic
compounds (e.g., antibiotics); (2) DNA segments that
encode products which are otherwise lacking in the recipient
cell (e.g., tRNA genes, auxotrophic markers); (3) DNA
segments that encode products which suppress the activity
of a gene product; (4) DNA segments that encode products
which can be readily identified (e.g., phenotypic markers
such as β-galactosidase, green fluorescent protein (GFP),
and cell surface proteins); (5) DNA segments that bind
products which are otherwise detrimental to cell survival
and/or function; (6) DNA segments that otherwise inhibit the
activity of any of the DNA segments described in Nos. 1-5
above (e.g., antisense oligonucleotides); (7) DNA segments
that bind products that modify a substrate (e.g., restriction
endonucleases); (8) DNA segments that can be used to
isolate a desired molecule (e.g., specific protein binding
sites); (9) DNA segments that encode a specific nucleotide
sequence which can be otherwise non-functional (e.g., for
PCR amplification of subpopulations of molecules); and/or
(10) DNA segments, which when absent, directly or indi-
rectly confer sensitivity to particular compounds.

Selection scheme: is any method which allows selection,
enrichment, or identification of a desired Product or
Product(s) from a mixture containing the Insert Donor, Vector
Donor, and/or any intermediates (e.g., a Cointegrate) and/or
Byproducts. The selection schemes of one preferred
embodiment have at least two components that are either 
linked or unlinked during recombinational cloning. One 
component is a Selectable marker. The other component 
controls the expression in vitro or in vivo of the Selectable 
marker, or survival of the cell harboring the plasmid carrying 
the Selectable marker. Generally, this controlling element 
will be a repressor or inducer of the Selectable marker, but 
other means for controlling expression of the Selectable 
marker can be used. Whether a repressor or activator is used 
will depend on whether the marker is for a positive or 
negative selection, and the exact arrangement of the various 
DNA segments, as will be readily apparent to those skilled 
in the art. A preferred requirement is that the selection 
scheme results in selection of or enrichment for only one or 
more desired Products. As defined herein, to select for a 
DNA molecule includes (a) selecting or enriching for the 
presence of the desired DNA molecule, and (b) selecting or 
enriching against the presence of DNA molecules that are 
not the desired DNA molecule. 

In one embodiment, the selection schemes (which can be 
carried out reversed) will take one of three forms, which will 
be discussed in terms of FIG. 1. The first, exemplified herein 
with a Selectable marker and a repressor therefor, selects for 
molecules having segment D and lacking segment C. The 
second selects against molecules having segment C and for 
molecules having segment D. Possible embodiments of the
second form would have a DNA segment carrying a gene
toxic to cells into which the in vitro reaction products are to
be introduced. A toxic gene can be a DNA that is expressed
as a toxic gene product (a toxic protein or RNA), or can be
toxic in and of itself. (In the latter case, the toxic gene is
understood to carry its classical definition of "heritable 
trait".) 

Examples of such toxic gene products are well known in
the art, and include, but are not limited to, restriction
endonucleases (e.g., DpnI) and genes that kill hosts in the
absence of a suppressing function, e.g., kicB. A toxic gene
can alternatively be selectable in vitro, e.g., a restriction site.

In the second form, segment D carries a Selectable 
marker. The toxic gene would eliminate transformants har- 
boring the Vector Donor, Cointegrate, and Byproduct 
molecules, while the Selectable marker can be used to select 
for cells containing the Product and against cells harboring 
only the Insert Donor. 
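
The logic of this second form can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: each transformed cell is taken to receive exactly one molecule, the toxic gene is assumed to sit on segment C and the Selectable marker on segment D (segment letters as in FIG. 1), and the function and variable names are hypothetical.

```python
# Sketch of the second selection form: segment C carries a toxic gene and
# segment D carries a Selectable marker. Each entry lists the FIG. 1
# segments present on one molecule; one molecule per cell is assumed.
MOLECULES = {
    "Insert Donor": {"A", "B"},
    "Vector Donor": {"C", "D"},
    "Cointegrate": {"A", "B", "C", "D"},
    "Product": {"A", "D"},
    "Byproduct": {"B", "C"},
}


def transformant_survives(segments: set[str]) -> bool:
    """A cell survives only with the marker (D) and without the toxic gene (C)."""
    has_marker = "D" in segments
    has_toxic_gene = "C" in segments
    return has_marker and not has_toxic_gene


survivors = [name for name, segs in MOLECULES.items() if transformant_survives(segs)]
print(survivors)
```

Under these assumptions the toxic gene removes the Vector Donor, Cointegrate, and Byproduct, and the marker removes the Insert Donor, so only the Product remains.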

The third form selects for cells that have both segments A 
and D in cis on the same molecule, but not for cells that have 
both segments in trans on different molecules. This could be 
embodied by a Selectable marker that is split into two 
inactive fragments, one each on segments A and D.
The fragments are so arranged relative to the recombina-
tion sites that when the segments are brought together by the
recombination event, they reconstitute a functional Select- 
able marker. For example, the recombinational event can 
link a promoter with a structural gene, can link two frag- 
ments of a structural gene, or can link genes that encode a 
heterodimeric gene product needed for survival, or can link 
portions of a replicon. 

Site-specific recombinase: is a type of recombinase which
typically has at least the following four activities: (1)
recognition of one or two specific DNA sequences; (2)
cleavage of said DNA sequence or sequences; (3) DNA
topoisomerase activity involved in strand exchange; and (4)
DNA ligase activity to reseal the cleaved strands of DNA.
See Sauer, B., Current Opinion in Biotechnology
5:521-527 (1994). Conservative site-specific recombination
is distinguished from homologous recombination and trans-
position by a high degree of specificity for both partners. The
strand exchange mechanism involves the cleavage and
rejoining of specific DNA sequences in the absence of DNA
synthesis (Landy, A. (1989) Ann. Rev. Biochem.
58:913-949).

Subcloning vector: is a cloning vector comprising a
circular or linear DNA molecule which includes an appro-
priate replicon. In the present invention, the subcloning
vector (segment D in FIG. 1) can also contain functional
and/or regulatory elements that are desired to be incorpo-
rated into the final product to act upon or with the cloned
DNA Insert (segment A in FIG. 1). The subcloning vector
can also contain a Selectable marker (contained in segment
C in FIG. 1).

Vector: is a DNA that provides a useful biological or
biochemical property to an Insert. Examples include
plasmids, phages, and other DNA sequences which are able
to replicate or be replicated in vitro or in a host cell, or to
convey a desired DNA segment to a desired location within
a host cell. A Vector can have one or more restriction
endonuclease recognition sites at which the DNA sequences
can be cut in a determinable fashion without loss of an
essential biological function of the vector, and into which a
DNA fragment can be spliced in order to bring about its
replication and cloning. Vectors can further provide primer
sites, e.g., for PCR, transcriptional and/or translational ini-
tiation and/or regulation sites, recombinational signals,
replicons, Selectable markers, etc. Clearly, methods of
inserting a desired DNA fragment which do not require the
use of homologous recombination or restriction enzymes
(such as, but not limited to, UDG cloning of PCR fragments
(U.S. Pat. No. 5,334,575, entirely incorporated herein by
reference), T:A cloning, and the like) can also be applied to
clone a fragment of DNA into a cloning vector to be used
according to the present invention. The cloning vector can
further contain a Selectable marker suitable for use in the
identification of cells transformed with the cloning vector.

Vector Donor: is one of the two parental DNA molecules 
of the present invention which carries the DNA segments 
encoding the DNA vector which is to become part of the 
desired Product. The Vector Donor comprises a subcloning 
vector D (or it can be called the cloning vector if the Insert 
Donor does not already contain a cloning vector) and a 
segment C flanked by recombination sites (see FIG. 1). 
Segments C and/or D can contain elements that contribute to 
selection for the desired Product daughter molecule, as 
described above for selection schemes. The recombination 
signals can be the same or different, and can be acted upon 
by the same or different recombinases. In addition, the 
Vector Donor can be linear or circular. 

Description 

One general scheme for an in vitro or in vivo method of 
the invention is shown in FIG. 1, where the Insert Donor and 
the Vector Donor can be either circular or linear DNA, but
are shown as circular. Vector D is exchanged for the original
cloning vector B. It is desirable to select for the daughter
vector containing elements A and D and against other 
molecules, including one or more Cointegrate(s). The square 
and circle are different sets of recombination sites (e.g., lox 
sites or att sites). Segment A or D can contain at least one 
Selection Marker, expression signals, origins of replication, 
or specialized functions for detecting, selecting, expressing, 
mapping or sequencing DNA, where D is used in this 
example. 

Examples of desired DNA segments that can be part of 
Element A or D include, but are not limited to, PCR
products, large DNA segments, genomic clones or 
fragments, cDNA clones, functional elements, etc., and 
genes or partial genes, which encode useful nucleic acids or 
proteins. Moreover, the recombinational cloning of the 
present invention can be used to make ex vivo and in vivo 
gene transfer vehicles for protein expression and/or gene 
therapy. 

In FIG. 1, the scheme provides the desired Product as 
containing vectors D and A, as follows. The Insert Donor 
(containing A and B) is first recombined at the square 
recombination sites by recombination proteins, with the 
Vector Donor (containing C and D), to form a Co-integrate 
having each of A-D-C-B. Next, recombination occurs at the 
circle recombination sites to form Product DNA (A and D) 
and Byproduct DNA (C and B). However, if desired, two or 
more different Co-integrates can be formed to generate two 
or more Products. 
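
The two-step scheme just described for FIG. 1 can be traced with a toy Python model. It is a sketch under simplifying assumptions: a circular molecule is represented as a list that alternates segments and recombination sites, the hybrid sites that real reactions generate (for example attL and attR) are not tracked, and all function names are hypothetical.

```python
# Toy model of the FIG. 1 scheme. A circular molecule is a list that
# alternates segments and recombination sites and wraps around at the end.
# Hybrid sites produced by real recombination are ignored in this sketch.
SITES = {"square", "circle"}


def cointegrate(mol_a: list[str], mol_b: list[str], site: str) -> list[str]:
    """Fuse two circles by intermolecular recombination at `site`."""
    def open_after_site(mol: list[str]) -> list[str]:
        i = mol.index(site)
        return mol[i + 1:] + mol[:i + 1]  # linearized, ending with the site

    return open_after_site(mol_a) + open_after_site(mol_b)


def resolve(mol: list[str], site: str) -> tuple[list[str], list[str]]:
    """Split one circle in two by recombination between two copies of `site`."""
    i, j = [k for k, token in enumerate(mol) if token == site]
    return mol[i + 1:j + 1], mol[j + 1:] + mol[:i + 1]


def segments(mol: list[str]) -> set[str]:
    """Return the segment letters carried by a molecule."""
    return {token for token in mol if token not in SITES}


insert_donor = ["A", "square", "B", "circle"]
vector_donor = ["C", "square", "D", "circle"]

# Step 1: recombination at the square sites gives the Co-integrate.
coint = cointegrate(insert_donor, vector_donor, "square")

# Step 2: recombination at the circle sites gives two daughter circles.
daughters = resolve(coint, "circle")
product = next(d for d in daughters if "A" in d)
byproduct = next(d for d in daughters if "A" not in d)
```

In this model the Co-integrate carries the segments in the cyclic order A-D-C-B, the Product carries A and D, and the Byproduct carries C and B, matching the scheme above.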

In one embodiment of the present in vitro or in vivo 
recombinational cloning method, a method for selecting at 
least one desired Product DNA is provided. This can be 
understood by consideration of the map of plasmid 
pEZC726 depicted in FIG. 2. The two exemplary recombi- 
nation sites are attP and loxP. On one segment defined by 
these sites is a kanamycin resistance gene whose promoter 
has been replaced by the tetOP operator/promoter from
transposon Tn10. In the absence of tet repressor protein, E.
coli RNA polymerase transcribes the kanamycin resistance
gene from the tetOP. If tet repressor is present, it binds to
tetOP and blocks transcription of the kanamycin resistance
gene. The other segment of pEZC726 has the tet repressor
gene expressed by a constitutive promoter. Thus cells trans-
formed by pEZC726 are resistant to chloramphenicol,
because of the chloramphenicol acetyl transferase gene on
the same segment as tetR, but are sensitive to kanamycin.
The recombination reactions result in separation of the tetR
gene from the regulated kanamycin resistance gene. This
separation results in kanamycin resistance in cells receiving
the desired recombination Product.

Two different sets of plasmids were constructed to dem-
onstrate the in vitro method. One set, for use with Cre
recombinase only (cloning vector 602 and subcloning vector
629 (FIG. 3)) contained loxP and loxP 511 sites. A second
set, for use with Cre and integrase (cloning vector 705 and
subcloning vector 726 (FIG. 2)) contained loxP and att sites.
The efficiency of production of the desired daughter plasmid
was about 60 fold higher using both enzymes than using Cre
alone. Nineteen of twenty four colonies from the Cre-only
reaction contained the desired product, while thirty eight of
thirty eight colonies from the integrase plus Cre reaction
contained the desired product plasmid.
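
For convenience, the colony counts reported above can be converted to percentages with a few lines of Python. The counts are taken from the text; the variable names are illustrative. Note that the roughly 60 fold figure refers to the efficiency of production of the daughter plasmid and is a separate measurement from the fraction of correct colonies computed here.

```python
from fractions import Fraction

# Colony counts as reported in the text:
# (colonies containing the desired product, colonies screened).
colony_counts = {
    "Cre only": (19, 24),
    "Integrase plus Cre": (38, 38),
}

percent_correct = {
    reaction: round(100 * float(Fraction(correct, screened)), 1)
    for reaction, (correct, screened) in colony_counts.items()
}

for reaction, pct in percent_correct.items():
    print(f"{reaction}: {pct}% of screened colonies carried the desired product")
```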

Other Selection Schemes

A variety of selection schemes
can be used that are known in the art as they can suit a
particular purpose for which the recombinational cloning is
carried out. Depending upon individual preferences and
needs, a number of different types of selection schemes can
be used in the recombinational cloning method of the
present invention. The skilled artisan can take advantage of
the availability of the many DNA segments or methods for
making them and the different methods of selection that are
routinely used in the art. Such DNA segments include but
are not limited to those which encode an activity such as,
but not limited to, production of RNA, peptide, or protein,
or providing a binding site for such RNA, peptide, or
protein. Examples of DNA molecules used in devising a
selection scheme are given above, under the definition of
"selection scheme".

Additional examples include but are not limited to:

(i) Generation of new primer sites for PCR (e.g., juxta-
position of two DNA sequences that were not previ-
ously juxtaposed);

(ii) Inclusion of a DNA sequence acted upon by a restric- 
tion endonuclease or other DNA modifying enzyme, 
chemical, ribozyme, etc.; 

(iii) Inclusion of a DNA sequence recognized by a DNA
binding protein, RNA, DNA, chemical, etc. (e.g., for
use as an affinity tag for selecting for or excluding from
a population) (Davis, Nucl. Acids Res. 24:702-706
(1996); J. Virol. 69:8027-8034 (1995));

(iv) In vitro selection of RNA ligands for the ribosomal
L22 protein associated with Epstein-Barr virus- 
expressed RNA by using randomized and cDNA- 
derived RNA libraries; 

(vi) The positioning of functional elements whose activity
requires a specific orientation or juxtaposition (e.g., (a)
a recombination site which reacts poorly in trans, but
when placed in cis, in the presence of the appropriate
proteins, results in recombination that destroys certain
populations of molecules; (e.g., reconstitution of a
promoter sequence that allows in vitro RNA synthesis).
The RNA can be used directly, or can be reverse
transcribed to obtain the desired DNA construct;

(vii) Selection of the desired product by size (e.g.,
fractionation) or other physical property of the 
molecule(s); and 

(viii) Inclusion of a DNA sequence required for a specific 
modification (e.g., methylation) that allows its identi- 
fication. 

After formation of the Product and Byproduct in the 
method of the present invention, the selection step can be 
carried out either in vitro or in vivo depending upon the 
particular selection scheme which has been optionally
devised in the particular recombinational cloning procedure. 

For example, an in vitro method of selection can be
devised for the Insert Donor and Vector Donor DNA molecules.
Such a scheme can involve engineering a rare restriction
site in the starting circular vectors in such a way that
after the recombination events the rare cutting sites end up
in the Byproduct. Hence, when the restriction enzyme which
binds and cuts at the rare restriction site is added to the
reaction mixture in vitro, all of the DNA molecules carrying
the rare cutting site, i.e., the starting DNA molecules, the
Cointegrate, and the Byproduct, will be cut and rendered
nonreplicable in the intended host cell. For example, cutting
sites in segments B and C (see FIG. 1) can be used to select
against all molecules except the Product. Alternatively, only
a cutting site in C is needed if one is able to select for
segment D, e.g., by a drug resistance gene not found on B.
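
The restriction-based selection described above can be illustrated with a minimal sketch. The segment layout is an assumption made for illustration only (Insert Donor = A+B, Vector Donor = C+D, Cointegrate = A+B+C+D, Product = A+D, Byproduct = B+C); FIG. 1 remains the authority on the actual arrangement, and the function name is hypothetical.

```python
# Hypothetical segment layout, assumed from the segment names used in the text.
MOLECULES = {
    "Insert Donor": {"A", "B"},
    "Vector Donor": {"C", "D"},
    "Cointegrate": {"A", "B", "C", "D"},
    "Product": {"A", "D"},
    "Byproduct": {"B", "C"},
}


def survivors(cut_segments, required_segments=frozenset()):
    """Return the molecules that carry no cut site and all required segments."""
    cut = set(cut_segments)
    need = set(required_segments)
    return sorted(
        name
        for name, segments in MOLECULES.items()
        if not (segments & cut) and need <= segments
    )


# Cutting sites in both B and C: only the Product escapes digestion.
print(survivors({"B", "C"}))
# Cutting site in C only, combined with selection for segment D.
print(survivors({"C"}, required_segments={"D"}))
```

Under these assumptions both schemes leave only the Product, which is the outcome the paragraph describes.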

Similarly, an in vitro selection method can be devised
when dealing with linear DNA molecules. DNA sequences
complementary to a PCR primer sequence can be so engineered
that they are transferred, through the recombinational
cloning method, only to the Product molecule. After the
reactions are completed, the appropriate primers are added
to the reaction solution and the sample is subjected to PCR.
Hence, all or part of the Product molecule is amplified.

Other in vivo selection schemes can be used with a variety
of E. coli cell lines. One is to put a repressor gene on one
segment of the subcloning plasmid, and a drug marker
controlled by that repressor on the other segment of the same
plasmid. Another is to put a killer gene on segment C of the
subcloning plasmid (FIG. 1). Of course a way must exist for
growing such a plasmid, i.e., there must exist circumstances
under which the killer gene will not kill. There are a number
of these genes known which require particular strains of E.
coli. One such scheme is to use the restriction enzyme DpnI,
which will not cleave unless its recognition sequence GATC
is methylated. Many popular common E. coli strains methylate
GATC sequences, but there are mutants in which cloned
DpnI can be expressed without harm.

Of course analogous selection schemes can be devised for
other host organisms. For example, the tet repressor/operator
of Tn10 has been adapted to control gene expression in
eukaryotes (Gossen, M., and Bujard, H., Proc. Natl. Acad.
Sci. USA 89:5547-5551 (1992)). Thus the same control of
drug resistance by the tet repressor exemplified herein can
be applied to select for Product in eukaryotic cells.
Recombination Proteins 

In the present invention, the exchange of DNA segments
is achieved by the use of recombination proteins, including
recombinases and associated co-factors and proteins. Various
recombination proteins are described in the art.
Examples of such recombinases include:

Cre: A protein from bacteriophage P1 (Abremski and
Hoess, J. Biol. Chem. 259(3):1509-1514 (1984)) catalyzes
the exchange (i.e., causes recombination) between 34 bp
DNA sequences called loxP (locus of crossover) sites (See
Hoess et al., Nucl. Acids Res. 14(5):2287 (1986)). Cre is
available commercially (Novagen, Catalog No. 69247-1).
Recombination mediated by Cre is freely reversible. From
thermodynamic considerations it is not surprising that
Cre-mediated integration (recombination between two molecules
to form one molecule) is much less efficient than
Cre-mediated excision (recombination between two loxP sites in
the same molecule to form two daughter molecules). Cre
works in simple buffers with either magnesium or spermidine
as a cofactor, as is well known in the art. The DNA
substrates can be either linear or supercoiled. A number of
mutant loxP sites have been described (Hoess et al., supra).
One of these, loxP 511, recombines with another loxP 511
site, but will not recombine with a loxP site.

Integrase: A protein from bacteriophage lambda that
mediates the integration of the lambda genome into the E.
coli chromosome. The bacteriophage λ Int recombinational
proteins promote irreversible recombination between its
substrate att sites as part of the formation or induction of a
lysogenic state. Reversibility of the recombination reactions
results from two independent pathways for integrative and
excisive recombination. Each pathway uses a unique, but
overlapping, set of the 15 protein binding sites that comprise
att site DNAs. Cooperative and competitive interactions
involving four proteins (Int, Xis, IHF and FIS) determine the
direction of recombination.

Integrative recombination involves the Int and IHF proteins
and sites attP (240 bp) and attB (25 bp). Recombination
results in the formation of two new sites: attL and attR.
Excisive recombination requires Int, IHF, and Xis, and sites
attL and attR to generate attP and attB. Under certain
conditions, FIS stimulates excisive recombination. In addition
to these normal reactions, it should be appreciated that
attP and attB, when placed on the same molecule, can
promote excisive recombination to generate two excision
products, one with attL and one with attR. Similarly,
intermolecular recombination between molecules containing attL
and attR, in the presence of Int, IHF and Xis, can result in
integrative recombination and the generation of attP and attB.
Hence, by flanking DNA segments with appropriate combinations
of engineered att sites, in the presence of the
appropriate recombination proteins, one can direct excisive
or integrative recombination, as reverse reactions of each
other.
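
The site and protein requirements just described can be summarized in a small lookup sketch. It encodes only what the text states (attB x attP needs Int and IHF and yields attL and attR; attL x attR needs Int, IHF and Xis and yields attB and attP). The function name and return format are hypothetical, and the sketch deliberately ignores FIS and any inhibitory effect a protein might have on the opposite pathway.

```python
# Requirements as stated in the text; FIS stimulation and inhibition are not modeled.
REACTIONS = {
    frozenset({"attB", "attP"}): ({"Int", "IHF"}, ("attL", "attR")),
    frozenset({"attL", "attR"}): ({"Int", "IHF", "Xis"}, ("attB", "attP")),
}


def recombine(site1, site2, proteins, same_molecule):
    """Return the new sites and topology, or None if no reaction is expected."""
    entry = REACTIONS.get(frozenset({site1, site2}))
    if entry is None:
        return None  # not a compatible pair of att sites
    required, new_sites = entry
    if not required <= set(proteins):
        return None  # a required protein is missing
    # Sites on one molecule excise two products; sites on two molecules fuse them.
    topology = "two molecules" if same_molecule else "one molecule"
    return {"new_sites": new_sites, "result": topology}


print(recombine("attB", "attP", {"Int", "IHF"}, same_molecule=False))
print(recombine("attL", "attR", {"Int", "IHF"}, same_molecule=True))  # None: no Xis
```

The `same_molecule` flag captures the point made in the paragraph that either site pair can give excision or integration depending on whether the sites start on one molecule or two.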

Each of the att sites contains a 15 bp core sequence;
individual sequence elements of functional significance lie
within, outside, and across the boundaries of this common
core (Landy, A., Ann. Rev. Biochem. 58:913 (1989)). Efficient
recombination between the various att sites requires
that the sequence of the central common region be identical
between the recombining partners; however, the exact
sequence is now found to be modifiable. Consequently,
derivatives of the att site with changes within the core are
now discovered to recombine at least as efficiently as the
native core sequences.

Integrase acts to recombine the attP site on bacteriophage
lambda (about 240 bp) with the attB site on the E. coli
genome (about 25 bp) (Weisberg, R. A. and Landy, A. in
Lambda II, p. 211 (1983), Cold Spring Harbor Laboratory),
to produce the integrated lambda genome flanked by attL
(about 100 bp) and attR (about 160 bp) sites. In the absence
of Xis (see below), this reaction is essentially irreversible.
The integration reaction mediated by integrase and IHF
works in vitro, with simple buffer containing spermidine.
Integrase can be obtained as described by Nash, H. A.,
Methods of Enzymology 100:210-216 (1983). IHF can be
obtained as described by Filutowicz, M., et al., Gene
147:149-150 (1994).
In the presence of the λ protein Xis (excise), integrase
catalyzes the reaction of attR and attL to form attP and attB,
i.e., it promotes the reverse of the reaction described above.
This reaction can also be applied in the present invention.

Other Recombination Systems. Numerous recombination
systems from various organisms can also be used, based on
the teaching and guidance provided herein. See, e.g., Hoess
et al., Nucleic Acids Research 14(6):2287 (1986); Abremski
et al., J. Biol. Chem. 261(1):391 (1986); Campbell, J.
Bacteriol. 174(23):7495 (1992); Qian et al., J. Biol. Chem.
267(11):7794 (1992); Araki et al., J. Mol. Biol. 225(1):25
(1992)). Many of these belong to the integrase family of
recombinases (Argos et al. EMBO J. 5:433-440 (1986)).
Perhaps the best studied of these are the Integrase/att system
from bacteriophage λ (Landy, A. (1993) Current Opinions in
Genetics and Devel. 3:699-707), the Cre/loxP system from
bacteriophage P1 (Hoess and Abremski (1990) In Nucleic
Acids and Molecular Biology, vol. 4. Eds.: Eckstein and
Lilley, Berlin-Heidelberg: Springer-Verlag; pp. 90-109),
and the FLP/FRT system from the Saccharomyces cerevisiae
2 μ circle plasmid (Broach et al. Cell 29:227-234 (1982)).

Members of a second family of site-specific
recombinases, the resolvase family (e.g., γδ, Tn3 resolvase,
Hin, Gin, and Cin) are also known. Members of this highly
related family of recombinases are typically constrained to
intramolecular reactions (e.g., inversions and excisions) and
can require host-encoded factors. Mutants have been isolated
that relieve some of the requirements for host factors
(Maeser and Kahmann (1991) Mol. Gen. Genet.
230:170-176), as well as some of the constraints of
intramolecular recombination.

Other site-specific recombinases similar to λ Int and
similar to P1 Cre can be substituted for Int and Cre. Such
recombinases are known. In many cases the purification of
such other recombinases has been described in the art. In
cases when they are not known, cell extracts can be used or
the enzymes can be partially purified using procedures
described for Cre and Int.

While Cre and Int are described in detail for reasons of 
example, many related recombinase systems exist and their 
application to the described invention is also provided 
according to the present invention. The integrase family of 
site-specific recombinases can be used to provide alternative 
recombination proteins and recombination sites for the 
present invention, as site-specific recombination proteins 
encoded by bacteriophage lambda, phi 80, P22, P2, 186, P4
and P1. This group of proteins exhibits an unexpectedly
large diversity of sequences. Despite this diversity, all of the 
recombinases can be aligned in their C-terminal halves. 

A 40-residue region near the C terminus is particularly 
well conserved in all the proteins and is homologous to a 
region near the C terminus of the yeast 2 mu plasmid Flp 
protein. Three positions are perfectly conserved within this 
family: histidine, arginine and tyrosine are found at respec- 
tive alignment positions 396, 399 and 433 within the well- 
conserved C-terminal region. These residues contribute to 
the active site of this family of recombinases, and suggest 
that tyrosine-433 forms a transient covalent linkage to DNA 
during strand cleavage and rejoining. See, e.g., Argos, P. et
al., EMBO J. 5:433-40 (1986).

Alternatively, IS231 and other Bacillus thuringiensis
transposable elements could be used as recombination proteins
and recombination sites. Bacillus thuringiensis is an
entomopathogenic bacterium whose toxicity is due to the
presence in the sporangia of delta-endotoxin crystals active
against agricultural pests and vectors of human and animal
diseases. Most of the genes coding for these toxin proteins
are plasmid-borne and are generally structurally associated
with insertion sequences (IS231, IS232, IS240, ISBT1 and
ISBT2) and transposons (Tn4430 and Tn5401). Several of
these mobile elements have been shown to be active and
participate in the crystal gene mobility, thereby contributing
to the variation of bacterial toxicity.

Structural analysis of the iso-IS231 elements indicates
that they are related to IS1151 from Clostridium perfringens
and distantly related to IS4 and IS186 from Escherichia coli.
Like the other IS4 family members, they contain a conserved
transposase-integrase motif found in other IS families and
retroviruses.

Moreover, functional data gathered from IS231A in
Escherichia coli indicate a non-replicative mode of
transposition, with a preference for specific targets. Similar
results were also obtained in Bacillus subtilis and B.
thuringiensis. See, e.g., Mahillon, J. et al., Genetica 93:13-26
(1994); Campbell, J. Bacteriol. 7495-7499 (1992).

The amount of recombinase which is added to drive the
recombination reaction can be determined by using known
assays. Specifically, a titration assay is used to determine the
appropriate amount of a purified recombinase enzyme, or
the appropriate amount of an extract.

Engineered Recombination Sites. The above recombinases
and corresponding recombinase sites are suitable for
use in recombination cloning according to the present
invention. However, wild-type recombination sites contain
sequences that reduce the efficiency or specificity of
recombination reactions as applied in methods of the present
invention. For example, multiple stop codons in attB, attR,
attP, attL and loxP recombination sites occur in multiple
reading frames on both strands, so recombination efficiencies
are reduced, e.g., where the coding sequence must cross
the recombination sites (only one reading frame is available
on each strand of loxP and attB sites), or impossible (in attP,
attR or attL).

Accordingly, the present invention also provides engineered
recombination sites that overcome these problems.
For example, att sites can be engineered to have one or
multiple mutations to enhance specificity or efficiency of the
recombination reaction and the properties of Product DNAs
(e.g., att1, att2, and att3 sites); to decrease reverse reaction
(e.g., removing P1 and H1 from attB). The testing of these
mutants determines which mutants yield sufficient
recombinational activity to be suitable for recombination
subcloning according to the present invention.
Mutations can therefore be introduced into recombination
sites for enhancing site specific recombination. Such mutations
include, but are not limited to: recombination sites
without translation stop codons that allow fusion proteins to
be encoded; recombination sites recognized by the same
proteins but differing in base sequence such that they react
largely or exclusively with their homologous partners, which
allow multiple reactions to be contemplated. Which particular
reactions take place can be specified by which particular
partners are present in the reaction mixture. For example, a
tripartite protein fusion could be accomplished with parental
plasmids containing recombination sites attR1 and attR2;
attL1 and attL3; and/or attR3 and attL2.

There are well known procedures for introducing specific
mutations into nucleic acid sequences. A number of these are
described in Ausubel, F. M. et al., Current Protocols in
Molecular Biology, Wiley Interscience, New York
(1989-1996). Mutations can be designed into
oligonucleotides, which can be used to modify existing
cloned sequences, or in amplification reactions. Random
mutagenesis can also be employed if appropriate selection
methods are available to isolate the desired mutant DNA or
RNA. The presence of the desired mutations can be confirmed
by sequencing the nucleic acid by well known
methods.

The following non-limiting methods can be used to engineer
a core region of a given recombination site to provide
mutated sites suitable for use in the present invention:

1. By recombination of two parental DNA sequences by
site-specific (e.g., attL and attR to give attB) or other
(e.g., homologous) recombination mechanisms, the
parental DNA segments containing one or more
base alterations resulting in the final core sequence;

2. By mutation or mutagenesis (site-specific, PCR,
random, spontaneous, etc.) directly of the desired core
sequence;

3. By mutagenesis (site-specific, PCR, random,
spontaneous, etc.) of parental DNA sequences, which
are recombined to generate a desired core sequence;
and

4. By reverse transcription of an RNA encoding the
desired core sequence.

The functionality of the mutant recombination sites can be 
demonstrated in ways that depend on the particular charac- 
teristic that is desired. For example, the lack of translation 
stop codons in a recombination site can be demonstrated by 
expressing the appropriate fusion proteins. Specificity of 
recombination between homologous partners can be dem- 
onstrated by introducing the appropriate molecules into in 
vitro reactions, and assaying for recombination products as 
described herein or known in the art. Other desired muta- 
tions in recombination sites might include the presence or 
absence of restriction sites, translation or transcription start
signals, protein binding sites, and other known functionalities
of nucleic acid base sequences. Genetic selection
schemes for particular functional attributes in the recombi- 
nation sites can be used according to known method steps. 
For example, the modification of sites to provide (from a 
pair of sites that do not interact) partners that do interact 
could be achieved by requiring deletion, via recombination 
between the sites, of a DNA sequence encoding a toxic 
substance. Similarly, selection for sites that remove trans- 
lation stop sequences, the presence or absence of protein 
binding sites, etc., can be easily devised by those skilled in 
the art. 

Accordingly, the present invention provides a nucleic acid 
molecule, comprising at least one DNA segment having at 
least two engineered recombination sites flanking a Select- 
able marker and/or a desired DNA segment, wherein at least 
one of said recombination sites comprises a core region 
having at least one engineered mutation that enhances 
recombination in vitro in the formation of a Cointegrate
DNA or a Product DNA.

The nucleic acid molecule can have at least one mutation
that confers at least one enhancement of said recombination,
said enhancement selected from the group consisting of
substantially (i) favoring excisive integration; (ii) favoring
excisive recombination; (ii) relieving the requirement for
host factors; (iii) increasing the efficiency of said Cointegrate
DNA or Product DNA formation; and (iv) increasing
the specificity of said Cointegrate DNA or Product DNA
formation.

The nucleic acid molecule preferably comprises at least 
one recombination site derived from attB, attP, attL or attR. 
More preferably the att site is selected from att1, att2, or att3,
as described herein.



In a preferred embodiment, the core region comprises a
DNA sequence selected from the group consisting of:
(a) RKYCWGCTTTYKTRTACNAASTSGB (m-att)
(SEQ ID NO:1);
(b) AGCCWGCTTTYKTRTACNAACTSGB (m-attB)
(SEQ ID NO:2);
(c) GTTCAGCTTTCKTRTACNAACTSGB (m-attR)
(SEQ ID NO:3);
(d) AGCCWGCTTTCKTRTACNAAGTSGB (m-attL)
(SEQ ID NO:4);
(e) GTTCAGCTTTYKTRTACNAAGTSGB (m-attP1)
(SEQ ID NO:5);
or a corresponding or complementary DNA or RNA
sequence, wherein R=A or G; K=G or T/U; Y=C or T/U;
W=A or T/U; N=A or C or G or T/U; S=C or G; and B=C or
G or T/U, as presented in 37 C.F.R. §1.822, which is entirely
incorporated herein by reference, wherein the core region
does not contain a stop codon in one or more reading frames.
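
As a reading aid for the degenerate codes defined above, the following sketch expands a consensus written in those codes into a regular expression and counts how many concrete DNA sequences it represents. The helper names are hypothetical, U is written as T, and the consensus string is the m-att sequence as transcribed here (SEQ ID NO:1 as filed remains authoritative).

```python
import re

# Degenerate codes as defined in the text (DNA alphabet; U written as T).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "K": "GT", "Y": "CT", "W": "AT",
    "S": "CG", "B": "CGT", "N": "ACGT",
}


def consensus_to_regex(consensus):
    """Compile a degenerate consensus into a regular expression."""
    return re.compile("".join("[%s]" % IUPAC[b] for b in consensus.upper()))


def matches(consensus, sequence):
    """True if the whole sequence is one of the sequences the consensus allows."""
    return consensus_to_regex(consensus).fullmatch(sequence.upper()) is not None


def degeneracy(consensus):
    """Number of distinct concrete sequences represented by the consensus."""
    count = 1
    for base in consensus.upper():
        count *= len(IUPAC[base])
    return count


M_ATT = "RKYCWGCTTTYKTRTACNAASTSGB"  # as transcribed; an assumption of this sketch
print(degeneracy(M_ATT))
print(matches("RKY", "AGC"), matches("RKY", "CGC"))
```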
The core region also preferably comprises a DNA
sequence selected from the group consisting of:

(a) AGCCTGCTTTTTTGTACAAACTTGT (attB1) (SEQ
ID NO:6);

(b) AGCCTGCTTTCTTGTACAAACTTGT (attB2)
(SEQ ID NO:7);

(c) ACCCAGCTTTCTTGTACAAACTTGT (attB3)
(SEQ ID NO:8);

(d) GTTCAGCTTTTTTGTACAAACTTGT (attR1) (SEQ ID
NO:9);

(e) GTTCAGCTTTCTTGTACAAACTTGT (attR2)
(SEQ ID NO:10);

(f) GTTCAGCTTTCTTGTACAAAGTTGG (attR3)
(SEQ ID NO:11);

(g) AGCCTGCTTTTTTGTACAAAGTTGG (attL1)
(SEQ ID NO:12);

(h) AGCCTGCTTTCTTGTACAAAGTTGG (attL2)
(SEQ ID NO:13);

(i) ACCCAGCTTTCTTGTACAAAGTTGG (attL3)
(SEQ ID NO:14);

(j) GTTCAGCTTTTTTGTACAAAGTTGG (attP1) (SEQ
ID NO:15);

(k) GTTCAGCTTTCTTGTACAAAGTTGG (attP2,P3)
(SEQ ID NO:16); or a corresponding or complementary
DNA or RNA sequence.
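
The requirement that a core region be free of stop codons in one or more reading frames can be checked with a six-frame scan. The sketch below is illustrative only: the function names are hypothetical, and the attB1 string is the sequence as transcribed here, so SEQ ID NO:6 as filed remains authoritative.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def stop_codons_by_frame(seq):
    """Map (strand, frame) to the list of stop codons found in that frame."""
    result = {}
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            codons = [s[i:i + 3] for i in range(frame, len(s) - 2, 3)]
            result[(strand, frame)] = [c for c in codons if c in STOPS]
    return result


def open_frames(seq):
    """Reading frames (strand, frame) that contain no stop codon."""
    return [key for key, stops in stop_codons_by_frame(seq).items() if not stops]


ATTB1 = "AGCCTGCTTTTTTGTACAAACTTGT"  # as transcribed; an assumption of this sketch
print(open_frames(ATTB1))
```

A site intended for translational fusions would need at least the frame used by the coding sequence to appear in the returned list.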
The present invention thus also provides a method for
making a nucleic acid molecule, comprising providing a
nucleic acid molecule having at least one engineered
recombination site comprising at least one DNA sequence having
at least 80-99% homology (or any range or value therein) to
at least one of SEQ ID NOS:1-16, or any suitable
recombination site, or which hybridizes under stringent conditions
thereto, as known in the art.

Clearly, there are various types and permutations of such
well-known in vitro and in vivo selection methods, each of
which is not described herein for the sake of brevity.
However, such variations and permutations are contemplated
and considered to be the different embodiments of the
present invention.

It is important to note that as a result of the preferred
embodiment being in vitro recombination reactions,
non-biological molecules such as PCR products can be manipulated
via the present recombinational cloning method. In one
example, it is possible to clone linear molecules into circular
vectors. There are a number of applications for the present
invention. These uses include, but are not limited to, changing
vectors, apposing promoters with genes, constructing
genes for fusion proteins, changing copy number, changing
replicons, cloning into phages, and cloning, e.g., PCR products
(with an attB site at one end and a loxP site at the other
end), genomic DNAs, and cDNAs.

The following examples are intended to further illustrate 
certain preferred embodiments of the invention and are not 
intended to be limiting in nature. 

EXAMPLES 

The present recombinational cloning method accom- 
plishes the exchange of nucleic acid segments to render 
something useful to the user, such as a change of cloning 
vectors. These segments must be flanked on both sides by 
recombination signals that are in the proper orientation with 
respect to one another. In the examples below the two 
parental nucleic acid molecules (e.g., plasmids) are called 
the Insert Donor and the Vector Donor. The Insert Donor 
contains a segment that will become joined to a new vector 
contributed by the Vector Donor. The recombination 
intermediate(s) that contain(s) both starting molecules is 
called the Cointegrate(s). The second recombination event 
produces two daughter molecules, called the Product (the 
desired new clone) and the Byproduct. 
Buffers 

Various known buffers can be used in the reactions of the 
present invention. For restriction enzymes, it is advisable to 
use the buffers recommended by the manufacturer. Alterna- 
tive buffers can be readily found in the literature or can be 
devised by those of ordinary skill in the art. 

Examples 1-3. One exemplary buffer for lambda inte- 
grase is comprised of 50 mM Tris-HCl, at pH 7.5-7.8, 70 
mM KCl, 5 mM spermidine, 0.5 mM EDTA, and 0.25 mg/ml 
bovine serum albumin, and optionally, 10% glycerol. 

One preferred buffer for P1 Cre recombinase is comprised
of 50 mM Tris-HCl at pH 7.5, 33 mM NaCl, 5 mM
spermidine, and 0.5 mg/ml bovine serum albumin.

The buffers for other site-specific recombinases which are
similar to lambda Int and P1 Cre are either known in the art
or can be determined empirically by the skilled artisans,
particularly in light of the above-described buffers.

Example 1 

Recombinational Cloning Using Cre and Cre & Int

Two pairs of plasmids were constructed to do the in vitro
recombinational cloning method in two different ways. One
pair, pEZC705 and pEZC726 (FIG. 2A), was constructed
with loxP and att sites, to be used with Cre and λ integrase.
The other pair, pEZC602 and pEZC629 (FIG. 3A), contained
the loxP (wild type) site for Cre, and a second mutant
lox site, loxP 511, which differs from loxP in one base (out
of 34 total). The minimum requirement for recombinational
cloning of the present invention is two recombination sites
in each plasmid, in general X and Y, and X' and Y'.
Recombinational cloning takes place if either or both types of site
can recombine to form a Cointegrate (e.g., X and X'), and if
either or both (but necessarily a site different from the type
forming the Cointegrate) can recombine to excise the Product
and Byproduct plasmids from the Cointegrate (e.g., Y and
Y'). It is important that the recombination sites on the same
plasmid do not recombine. It was found that the present
recombinational cloning could be done with Cre alone.
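
The two-step logic described above (one site type forms the Cointegrate, the other resolves it) can be traced with a small model in which each circular plasmid is a list of elements. This is a simplified sketch under stated assumptions: the segment names A to D are hypothetical labels (A is the segment to be moved, D the new vector), primed and unprimed sites are written identically, and the hybrid identity of sites after recombination is ignored.

```python
def rotate_to(circle, site):
    """Rotate a circular list so that it starts at the first copy of `site`."""
    i = circle.index(site)
    return circle[i:] + circle[:i]


def cointegrate(circle1, circle2, site):
    """Intermolecular recombination at `site`: two circles fuse into one."""
    return rotate_to(circle1, site) + rotate_to(circle2, site)


def resolve(circle, site):
    """Intramolecular recombination between the two copies of `site`:
    one circle splits into two."""
    i, j = [k for k, element in enumerate(circle) if element == site]
    return circle[i:j], circle[j:] + circle[:i]


insert_donor = ["X", "A", "Y", "B"]  # A is flanked by X and Y
vector_donor = ["X", "C", "Y", "D"]  # C is flanked by X and Y

co = cointegrate(insert_donor, vector_donor, "X")
first, second = resolve(co, "Y")
print(co)
print(first, second)
```

Whichever site type forms the Cointegrate, resolving at the other type yields one circle carrying A with D and one carrying B with C, which is why the two site types must not recombine with each other on the same plasmid.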
Cre-Only 

Two plasmids were constructed to demonstrate this
conception (see FIG. 3A). pEZC629 was the Vector Donor
plasmid. It contained a constitutive drug marker
(chloramphenicol resistance), an origin of replication, loxP
and loxP 511 sites, a conditional drug marker (kanamycin
resistance whose expression is controlled by the operator/
promoter of the tetracycline resistance operon of transposon
Tn10), and a constitutively expressed gene for the tet
repressor protein, tetR. E. coli cells containing pEZC629
were resistant to chloramphenicol at 30 μg/ml, but sensitive
to kanamycin at 100 μg/ml. pEZC602 was the Insert Donor
plasmid, which contained a different drug marker
(ampicillin resistance), an origin, and loxP and loxP 511
sites flanking a multiple cloning site.
This experiment was comprised of two parts as follows:
Part I: About 75 ng each of pEZC602 and pEZC629 were
mixed in a total volume of 30 μl of Cre buffer (50 mM
Tris-HCl pH 7.5, 33 mM NaCl, 5 mM spermidine-HCl, 500
μg/ml bovine serum albumin). Two 10 μl aliquots were
transferred to new tubes. One tube received 0.5 μl of Cre
protein (approx. 4 units per μl; partially purified according
to Abremski and Hoess, J. Biol. Chem. 259:1509 (1984)).
Both tubes were incubated at 37° C. for 30 minutes, then 70°
C. for 10 minutes. Aliquots of each reaction were diluted and
transformed into DH5α. Following expression, aliquots
were plated on 30 μg/ml chloramphenicol; 100 μg/ml
ampicillin plus 200 μg/ml methicillin; or 100 μg/ml kanamycin.

Results: See Table 1. The reaction without Cre gave 1.11 ×
10^6 ampicillin resistant colonies (from the Insert Donor
plasmid pEZC602); 7.8 × 10^5 chloramphenicol resistant
colonies (from the Vector Donor plasmid pEZC629); and 140
kanamycin resistant colonies (background). The reaction
with added Cre gave 7.5 × 10^5 ampicillin resistant colonies
(from the Insert Donor plasmid pEZC602); 6.1 × 10^5
chloramphenicol resistant colonies (from the Vector Donor
plasmid pEZC629); and 760 kanamycin resistant colonies
(mixture of background colonies and colonies from the
recombinational cloning Product plasmid). Analysis:
Because the number of colonies on the kanamycin plates
was much higher in the presence of Cre, many or most of
them were predicted to contain the desired Product plasmid.

TABLE 1

Enzyme   Ampicillin   Chloramphenicol   Kanamycin   Efficiency

None     1.1 × 10^6   7.8 × 10^5        140         140/7.8 × 10^5 = 0.02%

Cre      7.5 × 10^5   6.1 × 10^5        760         760/6.1 × 10^5 = 0.12%
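
The Efficiency column of Table 1 is the ratio of kanamycin resistant colonies to chloramphenicol resistant colonies. A short sketch of that arithmetic follows; the variable and function names are hypothetical and the colony counts are those given in the text.

```python
# (chloramphenicol resistant colonies, kanamycin resistant colonies) from Table 1.
TABLE_1 = {
    "None": (7.8e5, 140),
    "Cre": (6.1e5, 760),
}


def efficiency_percent(chloramphenicol, kanamycin):
    """Kanamycin resistant colonies as a percentage of Vector Donor transformants."""
    return 100.0 * kanamycin / chloramphenicol


for enzyme, (chlor, kan) in TABLE_1.items():
    print(enzyme, round(efficiency_percent(chlor, kan), 2))
```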



Part II: Twenty four colonies from the "+Cre" kanamycin
plates were picked and inoculated into medium containing
100 μg/ml kanamycin. Minipreps were done, and the miniprep
DNAs, uncut or cut with SmaI or HindIII, were
electrophoresed. Results: 19 of the 24 minipreps showed
supercoiled plasmid of the size predicted for the Product plasmid.
All 19 showed the predicted SmaI and HindIII restriction
fragments. Analysis: The Cre only scheme was demonstrated.
Specifically, it was determined to have yielded about
70% (19 of 24) Product clones. The efficiency was about
0.1% (760 kanamycin resistant clones resulted from 6.1 × 10^5
chloramphenicol resistant colonies).
Cre Plus Integrase 

The plasmids used to demonstrate this method are exactly
analogous to those used above, except that pEZC726, the
Vector Donor plasmid, contained an attP site in place of loxP
511, and pEZC705, the Insert Donor plasmid, contained an
attB site in place of loxP 511 (FIG. 2A).

This experiment was comprised of three parts as follows:
Part I: About 500 ng of pEZC705 (the Insert Donor
plasmid) was cut with ScaI, which linearized the plasmid
within the ampicillin resistance gene. (This was done
because the λ integrase reaction has been historically done
with the attB plasmid in a linear state (H. Nash, personal
communication). However, it was found later that the
integrase reaction proceeds well with both plasmids
supercoiled.) Then, the linear plasmid was ethanol precipitated
and dissolved in 20 μl of λ integrase buffer (50 mM
Tris-HCl, about pH 7.8, 70 mM KCl, 5 mM spermidine-HCl,
0.5 mM EDTA, 250 μg/ml bovine serum albumin).
Also, about 500 ng of the Vector Donor plasmid pEZC726
was ethanol precipitated and dissolved in 20 μl λ integrase
buffer. Just before use, λ integrase (2 μl, 393 μg/ml) was
thawed and diluted by adding 18 μl cold λ integrase buffer.
One μl IHF (integration host factor, 2.4 mg/ml, an accessory
protein) was diluted into 150 μl cold λ integrase buffer.
Aliquots (2 μl) of each DNA were mixed with λ integrase
buffer, with or without 1 μl each λ integrase and IHF, in a
total of 10 μl. The mixture was incubated at 25° C. for 45
minutes, then at 70° C. for 10 minutes. Half of each reaction
was applied to an agarose gel. Results: In the presence of
integrase and IHF, about 5% of the total DNA was converted
to a linear Cointegrate form. Analysis: Activity of integrase
and IHF was confirmed.

Part II: Three microliters of each reaction (i.e., with or
without integrase and IHF) were diluted into 27 μl of Cre
buffer (above), then each reaction was split into two 10 μl
aliquots (four altogether). To two of these reactions, 0.5 μl
of Cre protein (above) were added, and all reactions were
incubated at 37° C. for 30 minutes, then at 70° C. for 10
minutes. TE buffer (90 μl; TE: 10 mM Tris-HCl, pH 7.5, 1
mM EDTA) was added to each reaction, and 1 μl each was
transformed into E. coli DH5α. The transformation mixtures
were plated on 100 μg/ml ampicillin plus 200 μg/ml
methicillin; 30 μg/ml chloramphenicol; or 100 μg/ml kanamycin.
Results: See Table 2.

TABLE 2

Enzyme             Ampicillin   Chloramphenicol   Kanamycin   Efficiency

None               ...          20000             4           4/2 × 10^4 = 0.02%

Cre only           ...          3640              0           0

Integrase* only    ...          27000             9           9/2.7 × 10^4 = 0.03%

Integrase* + Cre   ...          ...               76          76/1.1 × 10^3 = 6.9%

*Integrase reactions also contained IHF.



Analysis: The Cre protein impaired transformation. When 
adjusted for this effect, the number of kanamycin resistant 
colonies, compared to the control reactions, increased more 
than 100 fold when both Cre and Integrase were used. This 
suggests a specificity of greater than 99%. 
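
The adjustment referred to in the Analysis can be illustrated by normalizing kanamycin resistant colonies to chloramphenicol resistant colonies, using the figures in the Efficiency column of Table 2. This is a sketch of one plausible reading of the calculation, not a statement of how the original figures were derived; the row labels follow the four reactions of Part II and the variable names are hypothetical.

```python
# (chloramphenicol resistant colonies, kanamycin resistant colonies),
# taken from the Efficiency column of Table 2.
ROWS = {
    "None": (2.0e4, 4),
    "Integrase + IHF": (2.7e4, 9),
    "Integrase + IHF + Cre": (1.1e3, 76),
}

# Normalizing by chloramphenicol counts corrects for the lower transformation
# observed when Cre is present.
efficiency = {name: kan / chlor for name, (chlor, kan) in ROWS.items()}

background = max(efficiency["None"], efficiency["Integrase + IHF"])
signal = efficiency["Integrase + IHF + Cre"]

fold_increase = signal / background
specificity = 1.0 - background / signal

print(round(fold_increase), round(100 * specificity, 1))
```

Taking the higher of the two control efficiencies as background gives a conservative fold increase that still exceeds 100, consistent with the statement in the Analysis.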

Part III: 38 colonies were picked from the Integrase plus
Cre plates, miniprep DNAs were made and cut with HindIII
to give diagnostic mapping information. Result: All 38 had
precisely the expected fragment sizes. Analysis: The Cre
plus λ integrase method was observed to have much higher
specificity than Cre-alone. Conclusion: The Cre plus λ
integrase method was demonstrated. Efficiency and
specificity were much higher than for Cre only.
Example 2

Using in vitro Recombinational Cloning to
Subclone the Chloramphenicol Acetyl Transferase
Gene into a Vector for Expression in Eukaryotic
Cells (FIG. 4A)

An Insert Donor plasmid, pEZC843, was constructed,
comprising the chloramphenicol acetyl transferase gene of
E. coli, cloned between loxP and attB sites such that the loxP
site was positioned at the 5'-end of the gene (FIG. 4B). A
Vector Donor plasmid, pEZC1003, was constructed, which
contained the cytomegalovirus eukaryotic promoter apposed
to a loxP site (FIG. 4C). One microliter aliquots of each
supercoiled plasmid (about 50 ng crude miniprep DNA)
were combined in a ten microliter reaction containing equal
parts of lambda integrase buffer (50 mM Tris-HCl, pH 7.8,
70 mM KCl, 5 mM spermidine, 0.5 mM EDTA, 0.25 mg/ml
bovine serum albumin) and Cre recombinase buffer (50 mM
Tris-HCl, pH 7.5, 33 mM NaCl, 5 mM spermidine, 0.5
mg/ml bovine serum albumin), two units of Cre
recombinase, 16 ng integration host factor, and 32 ng
lambda integrase. After incubation at 30° C. for 30 minutes
and 75° C. for 10 minutes, one microliter was transformed
into competent E. coli strain DH5α (Life Technologies,
Inc.). Aliquots of transformations were spread on agar plates
containing 200 µg/ml kanamycin and incubated at 37° C.
overnight. An otherwise identical control reaction contained
the Vector Donor plasmid only. The plate receiving 10% of
the control reaction transformation gave one colony; the
plate receiving 10% of the recombinational cloning reaction
gave 144 colonies. These numbers suggested that greater
than 99% of the recombinational cloning colonies contained
the desired product plasmid. Miniprep DNA made from six
recombinational cloning colonies gave the predicted size
plasmid (5026 base pairs), CMVProd. Restriction digestion
with NcoI gave the fragments predicted for the chloramphenicol
acetyl transferase cloned downstream of the CMV
promoter for all six plasmids.
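
The "greater than 99%" estimate follows from treating the control plate as background. A minimal Python sketch of that arithmetic is given below; the function name is hypothetical and the background-subtraction model is an assumption about how the figure was reached, not a statement of the authors' calculation.

```python
def fraction_desired(reaction_colonies, control_colonies):
    """Estimate the fraction of colonies carrying the product plasmid,
    treating the control (Vector Donor only) count as background."""
    if reaction_colonies <= 0:
        raise ValueError("reaction_colonies must be positive")
    return (reaction_colonies - control_colonies) / reaction_colonies

# 144 colonies from the recombinational cloning reaction, 1 from the control
estimate = fraction_desired(144, 1)
print(f"{estimate:.1%}")
```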

Example 3

Subcloned DNA Segments Flanked by attB Sites
Without Stop Codons
Part I: Background

The above examples are suitable for transcriptional
fusions, in which transcription crosses recombination sites.
However, both attR and loxP sites contain multiple stop
codons on both strands, so translational fusions, where the
coding sequence must cross the recombination sites, can be
difficult (only one reading frame is available on each
strand of loxP sites) or impossible (in attR or attL).

A principal reason for subcloning is to fuse protein
domains. For example, fusion of the glutathione
S-transferase (GST) domain to a protein of interest allows
the fusion protein to be purified by affinity chromatography
on glutathione agarose (Pharmacia, Inc., 1995 catalog). If
the protein of interest is fused to runs of consecutive
histidines (for example His6), the fusion protein can be
purified by affinity chromatography on chelating resins
containing metal ions (Qiagen, Inc.). It is often desirable to
compare amino terminal and carboxy terminal fusions for
activity, solubility, stability, and the like.

The attB sites of the bacteriophage λ integration system
were examined as an alternative to loxP sites, because they
are small (25 bp) and have some sequence flexibility (Nash,
H. A. et al., Proc. Natl. Acad. Sci. USA 84:4049-4053
(1987)). It was not previously suggested that multiple mutations
to remove all stop codons would result in useful recombination
sites for recombinational subcloning.

Using standard nomenclature for site specific recombination
in lambda bacteriophage (Weisberg, in Lambda III,
Hendrix, et al., eds., Cold Spring Harbor Laboratory, Cold
Spring Harbor, N.Y. (1989)), the nucleotide regions that
participate in the recombination reaction in an E. coli host
cell are represented as follows:

attP --P1--H1--P2--X--H2--C--O--C'--H'--P'1--P'2--P'3--  +  attB --B--O--B'--

                Int, IHF ↓↑ Xis, Int, IHF

attR --P1--H1--P2--X--H2--C--O--B'--  +  attL --B--O--C'--H'--P'1--P'2--P'3--,

where: O represents the 15 bp core DNA sequence found in
both the phage and E. coli genomes; B and B' represent
approximately 5 bases adjacent to the core in the E. coli
genome; and P1, H1, P2, X, H2, C, C', H', P'1, P'2, and P'3
represent known DNA sequences encoding protein binding
domains in the bacteriophage λ genome.

The reaction is reversible in the presence of the protein
Xis (excisionase); recombination between attL and attR
precisely excises the λ genome from its integrated state,
regenerating the circular λ genome containing attP and the
linear E. coli genome containing attB.
Part II: Construction and Testing of Plasmids Containing
Mutant att Sites

Mutant attL and attR sites were constructed. Importantly,
Landy et al. (Ann. Rev. Biochem. 58:913 (1989)) observed
that deletion of the P1 and H1 domains of attP facilitated the
excision reaction and eliminated the integration reaction,
thereby making the excision reaction irreversible. Therefore,
as mutations were introduced in attR, the P1 and H1 domains
were also deleted. attR sites in the present example lack the
P1 and H1 regions and have the NdeI site removed (base
27630 changed from C to G), and contain sequences corresponding
to bacteriophage λ coordinates 27619-27738
(GenBank release 92.0, bg:LAMCG, "Complete Sequence
of Bacteriophage Lambda").

The sequence of attB produced by recombination of wild
type attL and attR sites is:

              B          O           B'
attBwt: 5' AGCCT GCTTTTTTATACTAA CTTGA 3' (SEQ. ID NO:31)
        3' TCGGA CGAAAAAATATGATT GAACT 5' (SEQ. ID NO:32)

The stop codons are italicized and underlined. Note that
sequences of attL, attR, and attP can be derived from the attB
sequence and the boundaries of bacteriophage λ contained
within attL and attR (coordinates 27619 to 27818).

When mutant attR1 and attL1 sites were recombined the
sequence attB1 was produced (mutations in bold, large font):

             B          O           B'
attB1: 5' AGCCT GCTTTTTTGTACAAA CTTGT 3' (SEQ. ID NO:6)
       3' TCGGA CGAAAAAACATGTTT GAACA 5' (SEQ. ID NO:33)

Note that the four stop codons are gone.

When an additional mutation was introduced in the attR1
and attL1 sequences (bold), attR2 and attL2 sites resulted.
Recombination of attR2 and attL2 produced the attB2 site:

             B          O           B'
attB2: 5' AGCCT GCTTTCTTGTACAAA CTTGT 3' (SEQ. ID NO:7)
       3' TCGGA CGAAAGAACATGTTT GAACA 5' (SEQ. ID NO:34)
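
The statement that the four stop codons are gone can be checked by scanning both strands of each 25-base site for TAA, TAG and TGA in every reading frame. The Python sketch below is only an illustration of that check; the helper names are hypothetical, and the three sequences are the top strands of attBwt, attB1 and attB2 as read from this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def stop_triplets(seq):
    """Return (position, triplet) for every stop triplet in any frame of one strand."""
    return [(i, seq[i:i + 3]) for i in range(len(seq) - 2)
            if seq[i:i + 3] in STOP_CODONS]

def count_stops_both_strands(seq):
    """Count stop triplets on the given strand and on its reverse complement."""
    return len(stop_triplets(seq)) + len(stop_triplets(reverse_complement(seq)))

att_sites = {
    "attBwt": "AGCCTGCTTTTTTATACTAACTTGA",
    "attB1": "AGCCTGCTTTTTTGTACAAACTTGT",
    "attB2": "AGCCTGCTTTCTTGTACAAACTTGT",
}

for name, seq in att_sites.items():
    print(name, count_stops_both_strands(seq))
```

Scanning every offset rather than a single frame is deliberate: a fusion may enter the site in any of the three frames on either strand.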



The recombination activities of the above attL and attR
sites were assayed as follows. The attB site of plasmid
pEZC705 (FIG. 2B) was replaced with attLwt, attL1, or
attL2. The attP site of plasmid pEZC726 (FIG. 2C) was
replaced with attRwt (lacking regions P1 and H1), attR1, or
attR2. Thus, the resulting plasmids could recombine via
their loxP sites, mediated by Cre, and via their attR and attL
sites, mediated by Int, Xis, and IHF. Pairs of plasmids were
mixed and reacted with Cre, Int, Xis, and IHF, transformed
into E. coli competent cells, and plated on agar containing
kanamycin. The results are presented in Table 3:
TABLE 3

attRwt (pEZC1301)

attLwt (pEZC1313)
attL1 (pEZC1317)
attL2 (pEZC1321)
None

attLwt (pEZC1313)
attL1 (pEZC1317)
attL2 (pEZC1321)

attLwt (pEZC1313)
attL1 (pEZC1317)
attL2 (pEZC1321)



The above data show that whereas the wild type att and
att1 sites recombine to a small extent, the att1 and att2 sites
do not recombine detectably with each other.

Part III. Recombination was demonstrated when the core
region of both attB sites flanking the DNA segment of
interest did not contain stop codons. The physical state of the
participating plasmids was discovered to influence recombination
efficiency.

The appropriate att sites were moved into pEZC705 and
pEZC726 to make the plasmids pEZC1405 (FIG. 5G) (attR1
and attR2) and pEZC1502 (FIG. 5H) (attL1 and attL2). The
desired DNA segment in this experiment was a copy of the
chloramphenicol resistance gene cloned between the two
attL sites of pEZC1502. Pairs of plasmids were recombined
in vitro using Int, Xis, and IHF (no Cre because no loxP sites
were present). The yield of desired kanamycin resistant
colonies was determined when both parental plasmids were
circular, or when one plasmid was circular and the other
linear as presented in Table 4:

TABLE 4

Vector donor(1)      Gene donor(1)        Kanamycin resistant colonies(2)
Circular pEZC1405    None                 30
Circular pEZC1405    Circular pEZC1502    2680
Linear pEZC1405      None                 90
Linear pEZC1405      Circular pEZC1502    172000
Circular pEZC1405    Linear pEZC1502      73000

as used to transform E. coli DH5α
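
The effect of linearizing one partner can be expressed as a fold change over the circular × circular reaction. The Python sketch below is illustrative; the function name and the choice of the circular × circular count as baseline are assumptions, and the counts are those of Table 4.

```python
def fold_over(count, baseline):
    """Colony yield relative to a baseline reaction."""
    return count / baseline

# Kanamycin resistant colonies for pEZC1405 (vector donor) x pEZC1502 (gene donor)
counts = {
    ("circular", "circular"): 2680,
    ("linear", "circular"): 172000,
    ("circular", "linear"): 73000,
}

baseline = counts[("circular", "circular")]
folds = {pair: fold_over(n, baseline) for pair, n in counts.items()}
for (vector, gene), value in folds.items():
    print(f"{vector} pEZC1405 x {gene} pEZC1502: {value:.1f}-fold")
```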
Example 4

Demonstration of Recombinational Cloning
Without Inverted Repeats

Part I. Rationale

The above Example 3 showed that plasmids containing
inverted repeats of the appropriate recombination sites (for
example, attL1 and attL2 in plasmid pEZC1502) (FIG. 5H)
could recombine to give the desired DNA segment flanked
by attB sites without stop codons, also in inverted orientation.
A concern was the in vivo and in vitro influence of the
inverted repeats. For example, transcription of a desired
DNA segment flanked by attB sites in inverted orientation
could yield a single stranded RNA molecule that might form
a hairpin structure, thereby inhibiting translation.

Inverted orientation of similar recombination sites can be
avoided by placing the att sites in direct repeat arrangement.
If parental plasmids each have a wild type attL and
wild type attR site in direct repeat, the Int, Xis, and IHF
proteins will simply remove the DNA segment flanked by
those sites in an intramolecular reaction. However, the
mutant sites described in the above Example 3 suggested
that it might be possible to inhibit the intramolecular reaction
while allowing the intermolecular recombination to
proceed as desired.

Part II: Structure of Plasmids Without Inverted Repeats for
Recombinational Cloning

The attR2 sequence in plasmid pEZC1405 (FIG. 5G) was
replaced with attL2, in the opposite orientation, to make
pEZC1603 (FIG. 6A). The attL2 sequence of pEZC1502
(FIG. 5H) was replaced with attR2, in the opposite
orientation, to make pEZC1706 (FIG. 6B). Each of these
plasmids contained mutations in the core region that make
intramolecular reactions between att1 and att2 cores very
inefficient (see Example 3, above).

Plasmids pEZC1405, pEZC1502, pEZC1603 and
pEZC1706 were purified on Qiagen columns (Qiagen, Inc.).
Aliquots of plasmids pEZC1405 and pEZC1603 were linearized
with Xba I. Aliquots of plasmids pEZC1502 and
pEZC1706 were linearized with AlwN I. One hundred ng of
plasmids were mixed in buffer (equal volumes of 50 mM
Tris HCl pH 7.5, 25 mM Tris HCl pH 8.0, 70 mM KCl, 5
mM spermidine, 0.5 mM EDTA, 250 µg/ml BSA, 10%
glycerol) containing Int (43.5 ng), Xis (4.3 ng) and IHF (8.1
ng) in a final volume of 10 µl. Reactions were incubated for
45 minutes at 25° C, 10 minutes at 65° C, and 1 µl was
transformed into E. coli DH5α. After expression, aliquots
were spread on agar plates containing 200 µg/ml kanamycin
and incubated at 37° C.
Results, expressed as the number of colonies per 1 µl of
recombination reaction, are presented in Table 5:

TABLE 5

Vector Donor    Gene Donor    Colonies    Predicted % product



Analysis: Recombinational cloning using mutant attR and
attL sites was confirmed. The desired DNA segment is
subcloned between attB sites that do not contain any stop
codons in either strand. The enhanced yield of Product DNA
(when one parent was linear) was unexpected because of
earlier observations that the excision reaction was more
efficient when both participating molecules were supercoiled
and proteins were limiting (Nunes-Duby et al., Cell
50:779-788 (1987)).



Circular 1405    Circular 1502

Circular 1502

3640/3740 = 97%

1603    Circular 1

Analysis. In all configurations, i.e., circular or linear, the
pEZC1405 × pEZC1502 pair (with att sites in inverted repeat
configuration) was more efficient than the pEZC1603 ×
pEZC1706 pair (with att sites mutated to avoid hairpin
formation). The pEZC1603 × pEZC1706 pair gave higher
backgrounds and lower efficiencies than the pEZC1405 ×
pEZC1502 pair. While less efficient, 80% or more of the
colonies from the pEZC1603 × pEZC1706 reactions were
expected to contain the desired plasmid product. Making
one partner linear stimulated the reactions in all cases.
Part III: Confirmation of Product Plasmids' Structure 

Six colonies each from the linear pEZC1405 (FIG. 5G) ×
circular pEZC1502 (FIG. 5H), circular pEZC1405 × linear
pEZC1502, linear pEZC1603 (FIG. 6A) × circular
pEZC1706 (FIG. 6B), and circular pEZC1603 × linear
pEZC1706 reactions were picked into rich medium and
miniprep DNAs were prepared. Diagnostic cuts with Ssp I
gave the predicted restriction fragments for all 24 colonies.

Analysis. Recombination reactions between plasmids 
with mutant attL and attR sites on the same molecules gave 
the desired plasmid products with a high degree of speci- 
ficity. 

Example 5 

Recombinational Cloning with a Toxic Gene 
Part I: Background

Restriction enzyme Dpn I recognizes the sequence GATC
and cuts that sequence only if the A is methylated by the dam
methylase. Most commonly used E. coli strains are dam+.
Expression of Dpn I in dam+ strains of E. coli is lethal
because the chromosome of the cell is chopped into many
pieces. However, in dam- cells expression of Dpn I is
innocuous because the chromosome is immune to Dpn I
cutting.
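
The dependence of Dpn I cutting on dam methylation can be illustrated with a small site finder. The Python sketch below is a simplification under stated assumptions: the function names and the example sequence are hypothetical, and a dam+ host is modeled as methylating every GATC site.

```python
def gatc_sites(seq):
    """0-based start positions of every GATC in seq (case-insensitive)."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == "GATC"]

def dpn1_sensitive_sites(seq, dam_methylated):
    """Sites Dpn I would cut: every GATC if the DNA is dam-methylated, none otherwise."""
    return gatc_sites(seq) if dam_methylated else []

example = "AAGATCTTGGATCC"  # hypothetical sequence, not taken from any plasmid here

print(dpn1_sensitive_sites(example, dam_methylated=True))
print(dpn1_sensitive_sites(example, dam_methylated=False))
```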

In the general recombinational cloning scheme, in which 
the vector donor contains two segments C and D separated 
by recombination sites, selection for the desired product 
depends upon selection for the presence of segment D, and 
the absence of segment C. In the original Example segment 
D contained a drug resistance gene (Km) that was negatively 
controlled by a repressor gene found on segment C. When C 
was present, cells containing D were not resistant to kana- 
mycin because the resistance gene was turned off. 

The Dpn I gene is an example of a toxic gene that can
replace the repressor gene of the above embodiment. If
segment C expresses the Dpn I gene product, transforming
plasmid CD into a dam+ host kills the cell. If segment D is
transferred to a new plasmid, for example by recombinational
cloning, then selecting for the drug marker will be
successful because the toxic gene is no longer present.
Part II: Construction of a Vector Donor Using Dpn I as a 
Toxic Gene 

The gene encoding Dpn I endonuclease was amplified by
PCR using primers 5'CCA CCA CAA ACG CGT CCA TGG
AAT TAC ACT TTA ATT TAG3' (SEQ. ID NO:17) and
5'CCA CCA CAA GTC GAC GCA TGC CGA CAG CCT
TCC AAA TGT3' (SEQ. ID NO:18) and a plasmid containing
the Dpn I gene (derived from plasmids obtained from
Sanford A. Lacks, Brookhaven National Laboratory, Upton,
N.Y.; also available from American Type Culture Collection
as ATCC 67494) as the template.

Additional mutations were introduced into the B and B'
regions of attL and attR, respectively, by amplifying existing
attL and attR domains with primers containing the desired
base changes. Recombination of the mutant attL3 (made
with oligo Xis115) and attR3 (made with oligo Xis112)
yielded attB3 with the following sequence (differences from
attB1 in bold):



The attL3 sequence was cloned in place of attL2 of an
existing Gene Donor plasmid to give the plasmid pEZC2901
(FIG. 7A). The attR3 sequence was cloned in place of attR2
in an existing Vector Donor plasmid to give plasmid
pEZC2913 (FIG. 7B). The Dpn I gene was cloned into plasmid
pEZC2913 to replace the tet repressor gene. The resulting
Vector Donor plasmid was named pEZC3101 (FIG. 7C).
When pEZC3101 was transformed into the dam- strain
SCS110 (Stratagene), hundreds of colonies resulted. When
the same plasmid was transformed into the dam+ strain
DH5α, only one colony was produced, even though the
DH5α cells were about 20 fold more competent than the
SCS110 cells. When a related plasmid that did not contain
the Dpn I gene was transformed into the same two cell lines,
28 colonies were produced from the SCS110 cells, while 448
colonies resulted from the DH5α cells. This is evidence that
the Dpn I gene is being expressed on plasmid pEZC3101
(FIG. 7C), and that it is killing the dam+ DH5α cells but not
the dam- SCS110 cells.

Part III: Demonstration of Recombinational Cloning Using 
Dpn I Selection 

A pair of plasmids was used to demonstrate recombinational
cloning with selection for product dependent upon the
toxic gene Dpn I. Plasmid pEZC3101 (FIG. 7C) was
linearized with Mlu I and reacted with circular plasmid
pEZC2901 (FIG. 7A). A second pair of plasmids using
selection based on control of drug resistance by a repressor
gene was used as a control: plasmid pEZC1802 (FIG. 7D)
was linearized with Xba I and reacted with circular plasmid
pEZC1502 (FIG. 5H). Eight microliter reactions containing
the same buffer and proteins Xis, Int, and IHF as in previous
examples were incubated for 45 minutes at 25° C, then 10
minutes at 75° C, and 1 µl aliquots were transformed into
DH5α (i.e., dam+) competent cells, as presented in Table 6.



TABLE 6

Reaction #   Vector donor    Basis of selection   Gene donor          Colonies
1            pEZC3101/Mlu    Dpn I toxicity       --                  3
2            pEZC3101/Mlu    Dpn I toxicity       Circular pEZC2901   4000
3            pEZC1802/Xba    Tet repressor        --                  0
4            pEZC1802/Xba    Tet repressor        Circular pEZC1502   12100


Miniprep DNAs were prepared from four colonies from 
reaction #2, and cut with restriction enzyme Ssp I. All gave 
the predicted fragments. 

Analysis: Subcloning using selection with a toxic gene 
was demonstrated. Plasmids of the predicted structure were 
produced. 

Example 6 

Cloning of Genes with Uracil DNA Glycosylase
and Subcloning of the Genes with Recombinational
Cloning to Make Fusion Proteins
Part I: Converting an Existing Expression Vector to a Vector 
Donor for Recombinational Cloning 
A cassette useful for converting existing vectors into
functional Vector Donors was made as follows. Plasmid
pEZC3101 (FIG. 7C) was digested with Apa I and Kpn I,
treated with T4 DNA polymerase and dNTPs to render the
ends blunt, further digested with Sma I, Hpa I, and AlwN I
to render the undesirable DNA fragments small, and the 2.6
kb cassette containing the attR1-CmR-Dpn I-attR3 domains
was gel purified. The concentration of the purified cassette
was estimated to be about 75 ng DNA/µl.

Plasmid pGEX-2TK (FIG. 8A) (Pharmacia) allows
fusions between the protein glutathione S transferase and
any second coding sequence that can be inserted in its
multiple cloning site. pGEX-2TK DNA was digested with
Sma I and treated with alkaline phosphatase. About 75 ng of
the above purified DNA cassette was ligated with about 100
ng of the pGEX-2TK vector for 2.5 hours in a 5 µl ligation,
then 1 µl was transformed into competent BRL 3056 cells (a
dam- derivative of DH10B; dam- strains commercially
available include DM1 from Life Technologies, Inc., and
SCS110 from Stratagene). Aliquots of the transformation
mixture were plated on LB agar containing 100 µg/ml
ampicillin (resistance gene present on pGEX-2TK) and 30
µg/ml chloramphenicol (resistance gene present on the DNA
cassette). Colonies were picked and miniprep DNAs were
made. The orientation of the cassette in pGEX-2TK was
determined by diagnostic cuts with EcoR I. A plasmid with
the desired orientation was named pEZC3501 (FIG. 8B).
Part II: Cloning Reporter Genes Into a Recombinational
Cloning Gene Donor Plasmid in Three Reading Frames

Uracil DNA glycosylase (UDG) cloning is a method for
cloning PCR amplification products into cloning vectors
(U.S. Pat. No. 5,334,515, entirely incorporated herein by
reference). Briefly, PCR amplification of the desired DNA
segment is performed with primers that contain uracil bases
in place of thymidine bases in their 5' ends. When such PCR
products are incubated with the enzyme UDG, the uracil
bases are specifically removed. The loss of these bases
weakens base pairing in the ends of the PCR product DNA,
and when incubated at a suitable temperature (e.g., 37° C),
the ends of such products are largely single stranded. If such
incubations are done in the presence of linear cloning
vectors containing protruding 3' tails that are complementary
to the 3' ends of the PCR products, base pairing
efficiently anneals the PCR products to the cloning vector.
When the annealed product is introduced into E. coli cells by
transformation, in vivo processes efficiently convert it into a
recombinant plasmid.

UDG cloning vectors that enable cloning of any PCR
product in all three reading frames were prepared from
pEZC3201 (FIG. 8K) as follows. Eight oligonucleotides
were obtained from Life Technologies, Inc. (all written
5'→3'): rf1 top (GGCC GAT TAC GAT ATC CCA ACG ACC
GAA AAC CTG TAT TTT CAG GGT) (SEQ. ID NO:19),
rf1 bottom (CAG GTT TTC GGT CGT TGG GAT ATC GTA
ATC) (SEQ. ID NO:20), rf2 top (GGCCA GAT TAC GAT
ATC CCA ACG ACC GAA AAC CTG TAT TTT CAG
GGT) (SEQ. ID NO:21), rf2 bottom (CAG GTT TTC GGT
CGT TGG GAT ATC GTA ATC T) (SEQ. ID NO:22), rf3 top
(GGCCAA GAT TAC GAT ATC CCA ACG ACC GAA
AAC CTG TAT TTT CAG GGT) (SEQ. ID NO:23), rf3
bottom (CAG GTT TTC GGT CGT TGG GAT ATC GTA
ATC TT) (SEQ. ID NO:24), carboxy top (ACC GTT TAC
GTG GAC) (SEQ. ID NO:25) and carboxy bottom (TCGA
GTC CAC GTA AAC GGT TCC CAC TTA TTA) (SEQ. ID
NO:26). The rf1, 2, and 3 top strands and the carboxy bottom
strand were phosphorylated on their 5' ends with T4 polynucleotide
kinase, and then the complementary strands of
each pair were hybridized. Plasmid pEZC3201 (FIG. 8K)
was cut with Not I and Sal I, and aliquots of cut plasmid
were mixed with the carboxy-oligo duplex (Sal I end) and
either the rf1, rf2, or rf3 duplexes (Not I ends) (10 µg cut
plasmid (about 5 pmol) mixed with 250 pmol carboxy oligo
duplex, split into three 20 µl volumes, added 5 µl (250 pmol)
of rf1, rf2, or rf3 duplex and 2 µl = 2 units T4 DNA ligase to
each reaction). After 90 minutes of ligation at room
temperature, each reaction was applied to a preparative
agarose gel and the 2.1 kb vector bands were eluted and
dissolved in 50 µl of TE.
Part III: PCR of CAT and phoA Genes 

Primers were obtained from Life Technologies, Inc., to
amplify the chloramphenicol acetyl transferase (CAT) gene
from plasmid pACYC184, and phoA, the alkaline phosphatase
gene from E. coli. The primers had 12-base 5'
extensions containing uracil bases, so that treatment of PCR
products with uracil DNA glycosylase (UDG) would
weaken base pairing at each end of the DNAs and allow the
3' strands to anneal with the protruding 3' ends of the rf1, 2,
and 3 vectors described above. The sequences of the primers
(all written 5'→3') were: CAT left, UAU UUU CAG GGU
ATG GAG AAA AAA ATC ACT GGA TAT ACC (SEQ. ID
NO:27); CAT right, UCC CAC UUA UUA CGC CCC GCC
CTG CCA CTC ATC (SEQ. ID NO:28); phoA left, UAU
UUU CAG GGU ATG CCT GTT CTG GAA AAC CGG
(SEQ. ID NO:29); and phoA right, UCC CAC UUA UUA
TTT CAG CCC CAG GGC GGC TTT C (SEQ. ID NO:30).
The primers were then used for PCR reactions using known
method steps (see, e.g., U.S. Pat. No. 5,334,515, entirely
incorporated herein by reference), and the polymerase chain
reaction amplification products obtained with these primers
comprised the CAT or phoA genes with the initiating ATGs
but without any transcriptional signals. In addition, the
uracil-containing sequences on the amino termini encoded
the cleavage site for TEV protease (Life Technologies, Inc.),
and those on the carboxy terminal encoded consecutive TAA
nonsense codons.
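
What the primer extensions encode can be checked by translating them. The Python sketch below is illustrative only: the helper names are hypothetical, uracil is read as thymine, the standard genetic code is assumed, and the two primers are CAT left and CAT right as written above. The left primer is translated directly; the right primer is reverse complemented first because it anneals to the 3' end of the gene.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def as_dna(primer):
    """Strip spaces and read uracil as thymine."""
    return primer.replace(" ", "").upper().replace("U", "T")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def translate(seq):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))

cat_left = as_dna("UAU UUU CAG GGU ATG GAG AAA AAA ATC ACT GGA TAT ACC")
cat_right = as_dna("UCC CAC UUA UUA CGC CCC GCC CTG CCA CTC ATC")

left_peptide = translate(cat_left)
right_peptide = translate(reverse_complement(cat_right))
print(left_peptide)
print(right_peptide)
```

Under these assumptions the left extension translates to Tyr-Phe-Gln-Gly immediately ahead of the initiating Met. These are the last four residues of the commonly cited TEV recognition sequence Glu-Asn-Leu-Tyr-Phe-Gln-Gly, which is not stated in this document. The coding strand derived from the right primer ends in two consecutive stop codons.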

Unpurified PCR products (about 30 ng) were mixed with
the gel purified, linear rf1, rf2, or rf3 cloning vectors (about
50 ng) in a 10 µl reaction containing 1× REact 4 buffer (LTI)
and 1 unit UDG (LTI). After 30 minutes at 37° C, 1 µl
aliquots of each reaction were transformed into competent E.
coli DH5α cells (LTI) and plated on agar containing 50
µg/ml kanamycin. Colonies were picked and analysis of
miniprep DNA showed that the CAT gene had been cloned
in reading frame 1 (pEZC3601) (FIG. 8C), reading frame 2
(pEZC3609) (FIG. 8D) and reading frame 3 (pEZC3617)
(FIG. 8E), and that the phoA gene had been cloned in
reading frame 1 (pEZC3606) (FIG. 8F), reading frame 2
(pEZC3613) (FIG. 8G) and reading frame 3 (pEZC3621)
(FIG. 8H).

Part IV: Subcloning of CAT or phoA from UDG Cloning
Vectors into a GST Fusion Vector

Plasmids encoding fusions between GST and either CAT
or phoA in all three reading frames were constructed by
recombinational cloning as follows. Miniprep DNA of GST
vector donor pEZC3501 (FIG. 8B) (derived from Pharmacia
plasmid pGEX-2TK as described above) was linearized with
Cla I. About 5 ng of vector donor were mixed with about 10
ng each of the appropriate circular gene donor vectors
containing CAT or phoA in 8 µl reactions containing buffer
and recombination proteins Int, Xis, and IHF (above). After
incubation, 1 µl of each reaction was transformed into E. coli
strain DH5α and plated on ampicillin, as presented in Table 7:
TABLE 7

DNA                                    Colonies (10% of each transformation)
Linear vector donor (pEZC3501/Cla)     0
Vector donor + CAT rf1                 110
Vector donor + CAT rf3                 148
Vector donor + phoA rf1                121
Vector donor + phoA rf2                128
Vector donor + phoA rf3                31

Part V: Expression of Fusion Proteins

Two colonies from each transformation were picked into
2 ml of rich medium (CircleGrow, Bio101 Inc.) in 17×100
mm plastic tubes (Falcon 2059, Becton Dickinson) containing
100 µg/ml ampicillin and shaken vigorously for about 4
hours at 37° C, at which time the cultures were visibly
turbid. One ml of each culture was transferred to a new tube
containing 10 µl of 10% (w/v) IPTG to induce expression of
GST. After 2 hours additional incubation, all cultures had
about the same turbidity; the A600 of one culture was 1.5.
Cells from 0.35 ml each culture were harvested and treated
with sample buffer (containing SDS and β-mercaptoethanol)
and aliquots equivalent to about 0.15 A600 units of cells
were applied to a Novex 4-20% gradient polyacrylamide
gel. Following electrophoresis the gel was stained with
Coomassie blue.

Results: Enhanced expression of single protein bands was
seen for all 12 cultures. The observed sizes of these proteins
correlated well with the sizes predicted for GST being fused
(through attB recombination sites without stop codons) to
CAT or phoA in three reading frames: CAT rf1=269 amino
acids; CAT rf2=303 amino acids; CAT rf3=478 amino acids;
phoA rf1=282 amino acids; phoA rf2=280 amino acids; and
phoA rf3=705 amino acids.

Analysis: Both CAT and phoA genes were subcloned into
a GST fusion vector in all three reading frames, and expression
of the six fusion proteins was demonstrated.

While the foregoing invention has been described in some
detail for purposes of clarity and understanding, it will be
appreciated by one skilled in the art from a reading of this
disclosure that various changes in form and detail can be
made without departing from the true scope of the invention
and appended claims. All patents and publications cited
herein are entirely incorporated herein by reference.

SEQUENCE LISTING

( 1 ) GENERAL INFORMATION:

( i i i ) NUMBER OF SEQUENCES: 35

( 2 ) INFORMATION FOR SEQ ID NO:1:

( i ) SEQUENCE CHARACTERISTICS:
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: both
( D ) TOPOLOGY: both

( i i ) MOLECULE TYPE: cDNA

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:1:

RKYCWGCTTT YKTRTACNAA STSGB 25

( 2 ) INFORMATION FOR SEQ ID NO:2:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 25 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: both

( i i ) MOLECULE TYPE: cDNA

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:2:

AGCCWGCTTT YKTRTACNAA CTSGB

( 2 ) INFORMATION FOR SEQ ID NO:3:

( i i ) MOLECULE TYPE: cDNA

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:3:

CNAA CTSGB

( 2 ) INFORMATION FOR SEQ ID NO:4:

( i ) SEQUENCE CHARACTERISTICS:
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: both
( D ) TOPOLOGY: both

( 2 ) INFORMATION FOR SEQ ID NO:5:

( i ) SEQUENCE CHARACTERISTICS:
( D )

( i i ) MOLECULE TYPE: cDNA

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:5:

GTTCAGCTTT YKTRTACNAA GTSGB

( 2 ) INFORMATION FOR SEQ ID NO:6:

( i ) SEQUENCE CHARACTERISTICS:
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: both
( D ) TOPOLOGY: both

( i i ) MOLECULE TYPE: cDNA

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:6:

AGCCTGCTTT TTTGTACAAA CTTGT

( 2 ) INFORMATION FOR SEQ ID NO:7:

( i i ) MOLECULE TYPE: cDNA

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:7:

CTTT CTTGTACAAA CTTGT

( 2 ) INFORMATION FOR SEQ ID NO:8:

( i ) SEQUENCE CHARACTERISTICS:
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: both
( D ) TOPOLOGY: both

( i i ) MOLECULE TYPE: cDNA

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:8:

TTT CTTGTACAAA CTTGT
( 2 ) INFORMATION FOR SEQ ID NO:9:

( i ) SEQUENCE CHARACTERISTICS:
( B ) TYPE: nucleic acid
( D ) TOPOLOGY: both

( i i ) MOLECULE TYPE: cDNA

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:9:

GTTCAGCTTT TTTGTACAAA CTTGT

( 2 ) INFORMATION FOR SEQ ID NO:10:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 25 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: both
( D ) TOPOLOGY: both

( i i ) MOLECULE TYPE: cDNA

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:10:

GTTCAGCTTT CTTGTACAAA CTTGT

( 2 ) INFORMATION FOR SEQ ID NO:11:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 25 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: both
( D ) TOPOLOGY: both

( i i ) MOLECULE TYPE: cDNA

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:11:

GTTCAGCTTT CTTGTACAAA GTTGG

( 2 ) INFORMATION FOR SEQ ID NO:12:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 25 base pairs
( B ) TYPE: nucleic acid
( D ) TOPOLOGY: both

( i i ) MOLECULE TYPE: cDNA

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:12:

AGCCTGCTTT TTTGTACAAA GTTGG

( 2 ) INFORMATION FOR SEQ ID NO:13:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 25 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: both
( D ) TOPOLOGY: both

( i i ) MOLECULE TYPE: cDNA

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:13:

AGCCTGCTTT CTTGTACAAA GTTGG

( 2 ) INFORMATION FOR SEQ ID NO:14:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 25 base pairs
( B ) TYPE: nucleic acid
( D ) TOPOLOGY: both
( i i ) MOLECULE TYPE: cDNA

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:14:

ACCCAGCTTT CTTGTACAAA GTTGG

( 2 ) INFORMATION FOR SEQ ID NO:15:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 25 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: both
( D ) TOPOLOGY: both

( i i ) MOLECULE TYPE: cDNA

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:15:

GTTCAGCTTT TTTGTACAAA GTTGG

( 2 ) INFORMATION FOR SEQ ID NO:16:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 25 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: both
( D ) TOPOLOGY: both

( i i ) MOLECULE TYPE: cDNA

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:16:

GTTCAGCTTT CTTGTACAAA GTTGG

( 2 ) INFORMATION FOR SEQ ID NO:17:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 39 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: both
( D ) TOPOLOGY: both

( i i ) MOLECULE TYPE: cDNA

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:17:

CCACCACAAA CGCGTCCATG GAATTAC

( 2 ) INFORMATION FOR SEQ ID NO:18:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 39 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: both
( D ) TOPOLOGY: both

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:18:

AAG TCGACGCATG CCGACAGCCT TCCA

( 2 ) INFORMATION FOR SEQ ID NO:19:

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:19:
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GGCCGATTAC GATATCCCAA CGACCGAAAA CCTGTATTTT CAGGGT 



( 2 ) INFORMATION FOR SEQ ID 
( i ) SEQUENCE 



( B ) TYPE: mideic acid 

( C ) SniANDEDNESS: both 

( D ) TOPOLOGY: liotti 



( X i ) SEQUENCE DESCRIPTION: SEQ ID NO:20: 
CAGGTTTTCG GTCGTTGGGA TATCGTAATC 

( 2 ) INFORMATION FOR SEQ ID N0:21: 



( i i ) MOLECULE TYPE: cDNA 
( X i ) SEQUENCE DESCRIPTION: SEQ ID N0:21: 
GGCCAGATTA CGATATCCCA ACGACCC 



( 2 ) INFORMATION FOR SEQ ID NO:22: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 31 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: both 
( D ) TOPOLOGY: both 

( i i ) MOLECULE TYPE: cDNA 

( X i ) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

CAGGTTTTCG GTCGTTGGGA TATCGTAATC T 

( 2 ) INFORMATION FOR SEQ ID NO:23: 

( C ) STRANDEDNESS: both 

( i i ) MOLECULE TYPE: cDNA 

( X i ) SEQUENCE DESCRIPTION: SEQ ID NO:23: 



ACCTGTATT TTCAGGGT 



( 2 ) INFORMATION FOR SEQ ID NO:24: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 32 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: both 
( D ) TOPOLOGY: both 

( i i ) MOLECULE TYPE: cDNA 

( X i ) SEQUENCE DESCRIPTION: SEQ ID NO:24: 

TTCG GTCGTTGGGA TATCGTAATC TT 
( C ) STRANDEDNESS: both 



( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:25: 
ACCGTTTACG TGGAC 

( 2 ) INFORMATION FOR SEQ ID NO:26: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 31 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: both 
( D ) TOPOLOGY: both 

( i i ) MOLECULE TYPE: cDNA 

( X i ) SEQUENCE DESCRIPTION: SEQ ID NO:26: 

TCGAGTCCAC GTAAACGGTT CCCACTTATT 

( 2 ) INFORMATION FOR SEQ ID NO:27: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 39 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: both 
( D ) TOPOLOGY: both 

( i i ) MOLECULE TYPE: cDNA 

( X i ) SEQUENCE DESCRIPTION: SEQ ID NO:27: 

UAUUUUCAGG GUATGGAGAA AAAAATCACT 

( 2 ) INFORMATION FOR SEQ ID NO:28: 



( i i ) MOLECULE TYPE: cDNA 

( X i ) SEQUENCE DESCRIPTION: SEQ ID NO:28: 



( 2 ) INFORMATION FOR SEQ ID NO:29: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 33 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: both 
( D ) TOPOLOGY: both 

( i i ) MOLECULE TYPE: cDNA 

( X i ) SEQUENCE DESCRIPTION: SEQ ID NO:29: 

UAUUUUCAGG GUATGCCTGT TCTGGAAAAC CGG 



( 2 ) INFORMATION FOR SEQ ID NO:30: 
( i i ) MOLECULE TYPE: cDNA 
( X i ) SEQUENCE DESCRIPTION: SEQ ID NO:30: 
UCCCACUUAU UATTTCAGCC CCAGGGCGGC TTTC 

( 2 ) INFORMATION FOR SEQ ID NO:31: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 25 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: both 
( D ) TOPOLOGY: both 

( i i ) MOLECULE TYPE: cDNA 

( X i ) SEQUENCE DESCRIPTION: SEQ ID NO:31: 

AGCCTGCTTT TTTATACTAA CTTGA 

( 2 ) INFORMATION FOR SEQ ID NO:32: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 25 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: both 
( D ) TOPOLOGY: both 

( i i ) MOLECULE TYPE: cDNA 

( X i ) SEQUENCE DESCRIPTION: SEQ ID NO:32: 

TCAAGTTAGT ATAAAAAAGC AGGCT 

( 2 ) INFORMATION FOR SEQ ID NO:33: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 25 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: both 
( D ) TOPOLOGY: both 

( i i ) MOLECULE TYPE: cDNA 

( X i ) SEQUENCE DESCRIPTION: SEQ ID NO:33: 

ACAAGTTTGT ACAAAAAAGC AGGCT 

( 2 ) INFORMATION FOR SEQ ID NO:34: 



( i i ) MOLECULE TYPE: cDNA 
( X i ) SEQUENCE DESCRIPTION: SEQ ID NO:34: 
CAAGTTTGT ACAAGAAAGC AGGCT 

( 2 ) INFORMATION FOR SEQ ID NO:35: 

( i ) SEQUENCE CHARACTERISTICS: 

( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: both 
( D ) TOPOLOGY: both 
( i i ) MOLECULE TYPE: cDNA 
( X i ) SEQUENCE DESCRIPTION: SEQ ID NO:35: 
ACCACTTTGT ACAAGAAAGC TGGGT 



What is claimed is: 

1. A Vector Donor DNA molecule comprising a first DNA 
segment and a second DNA segment, said first or second 
DNA segment containing at least one Selectable marker, 

wherein 

i) said first or second DNA segment is flanked by at 
least a first and a second recombination site; and 
ii) said first recombination site and said second 
recombination site do not recombine with each other. 

2. The Vector Donor DNA molecule according to claim 1, 
wherein said Selectable marker comprises at least one 
inactive fragment of the Selectable marker, wherein the 
inactive fragment reconstitutes a functional Selectable 
marker when recombined across said first or second 
recombination site with a further DNA segment comprising another 
inactive fragment of the Selectable marker. 

3. The Vector Donor DNA molecule of claim 1, wherein 
at least one of said recombination sites is derived from at 
least one recombination site selected from the group 
consisting of attB, attP, attL, and attR. 

4. The Vector Donor DNA molecule according to claim 1, 
wherein the Selectable marker comprises at least one DNA 
segment selected from the group consisting of: 

(i) a DNA segment that encodes a product that provides 
resistance against otherwise toxic compounds; 

(ii) a DNA segment that encodes a heterologous product; 

(iii) a DNA segment that encodes a product that 
suppresses the activity of a gene product; 

(iv) a DNA segment that encodes a product that is 
identifiable; 

(v) a DNA segment that encodes a product that inhibits a 
cell function; 

(vi) a DNA segment that inhibits the activity of any of the 
DNA segments of (i)-(v) above; 

(vii) a DNA segment that binds a product that modifies a 
substrate; 

(viii) a DNA segment that provides for the isolation of a 
desired molecule; 

(ix) a DNA segment that encodes a specific nucleotide 
recognition sequence which is recognized by an 
enzyme; and 

(x) a DNA segment that, when deleted, confers sensitivity 
to cell-killing by particular compounds. 

5. The Vector Donor DNA molecule according to claim 4, 
wherein said Selectable marker comprises at least one 
marker selected from the group consisting of an antibiotic 
resistance gene, a tRNA gene, an auxotrophic marker, a toxic 
gene, a phenotypic marker, an antisense oligonucleotide, a 
restriction endonuclease, a restriction endonuclease 
cleavage site, an enzyme cleavage site, a protein binding site, and 
a sequence complementary to a PCR primer sequence. 

6. The Vector Donor DNA molecule according to claim 1, 
wherein said recombination site comprises a DNA sequence 
selected from the group consisting of: 

(a) RKYCWGCTTTYKTRTACNAASTSGB (m-att) 
(SEQ ID NO:1); 

(b) AGCCWGCTTTYKTRTACNAACTSGB (m-attB) 
(SEQ ID NO:2); 

(c) GTTCAGCTTTCKTRTACNAACTSGB (m-attR) 
(SEQ ID NO:3); 

(d) AGCCWGCTTTCKTRTACNAAGTSGB (m-attL) 
(SEQ ID NO:4); 

(e) GTTCAGCTTTYKTRTACNAAGTSGB (m-attP1) 
(SEQ ID NO:5); 

and a corresponding or complementary DNA or RNA 
sequence, wherein R=A or G; K=G or T/U; Y=C or T/U; 
W=A or T/U; N=A, C, or G or T/U; S=C or G; and B=C, G 
or T/U. 
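
As an illustration of the degenerate-base definitions recited in claim 6, the following Python sketch checks whether a concrete sequence is consistent with a consensus written in those symbols. The symbol table is taken from the claim wording (R, K, Y, W, N, S, B, with T/U read as "T or U"). The helper names allowed_bases, mismatch_positions and matches_consensus are hypothetical and introduced only for this example, and treating a literal T in the consensus as also admitting U is an assumption.

```python
# Degenerate symbols as defined in claim 6; "T/U" is read as "T or U".
DEGENERATE = {
    "R": {"A", "G"},
    "K": {"G", "T", "U"},
    "Y": {"C", "T", "U"},
    "W": {"A", "T", "U"},
    "N": {"A", "C", "G", "T", "U"},
    "S": {"C", "G"},
    "B": {"C", "G", "T", "U"},
}


def allowed_bases(symbol):
    """Return the set of bases one consensus symbol permits (hypothetical helper)."""
    symbol = symbol.upper()
    if symbol in DEGENERATE:
        return DEGENERATE[symbol]
    if symbol in ("T", "U"):
        # Assumption: a literal T in the consensus also admits U in an RNA copy.
        return {"T", "U"}
    if symbol in ("A", "C", "G"):
        return {symbol}
    raise ValueError(f"unknown consensus symbol: {symbol!r}")


def mismatch_positions(consensus, sequence):
    """List the 0-based positions where the sequence violates the consensus."""
    if len(consensus) != len(sequence):
        raise ValueError("consensus and sequence must have equal length")
    return [
        i
        for i, (symbol, base) in enumerate(zip(consensus, sequence.upper()))
        if base not in allowed_bases(symbol)
    ]


def matches_consensus(consensus, sequence):
    """True when every position of the sequence is permitted by the consensus."""
    return not mismatch_positions(consensus, sequence)
```

With toy inputs, matches_consensus("RKYCW", "AGCCA") is true because each base falls in the set its symbol permits, whereas a sequence beginning with C fails at position 0 because R admits only A or G. Reporting positions rather than a bare yes/no makes it easier to see which symbol a candidate site violates.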

7. The Vector Donor DNA molecule according to claim 6, 
wherein said DNA sequence comprises a sequence selected 
from the group consisting of: 

(a) AGCCTGCTTTTTTGTACAAACTTGT (attB1) 
(SEQ ID NO:6); 

(b) AGCCTGCTTTCTTGTACAAACTTGT (attB2) 
(SEQ ID NO:7); 

(c) ACCCAGCTTTCTTGTACAAACTTGT (attB3) 
(SEQ ID NO:8); 

(d) GTTCAGCTTTTTTGTACAAACTTGT (attR1) 
(SEQ ID NO:9); 

(e) GTTCAGCTTTCTTGTACAAACTTGT (attR2) 
(SEQ ID NO:10); 

(f) GTTCAGCTTTCTTGTACAAAGTTGG (attR3) 
(SEQ ID NO:11); 

(g) AGCCTGCTTTTTTGTACAAAGTTGG (attL1) 
(SEQ ID NO:12); 

(h) AGCCTGCTTTCTTGTACAAAGTTGG (attL2) 
(SEQ ID NO:13); 

(i) ACCCAGCTTTCTTGTACAAAGTTGG (attL3) 
(SEQ ID NO:14); 

(j) GTTCAGCTTTTTTGTACAAAGTTGG (attP1) 
(SEQ ID NO:15); 

(k) GTTCAGCTTTCTTGTACAAAGTTGG (attP2,P3) 
(SEQ ID NO:16); 

and a corresponding or complementary DNA or RNA 
sequence. 
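
The phrase "a corresponding or complementary DNA or RNA sequence" in claim 7 can be made concrete with a short Python sketch. It assumes that "complementary" means the reverse complement read 5' to 3' and that the "corresponding RNA" is the same strand with T replaced by U; neither reading is stated in the claim itself. The function names reverse_complement and to_rna are hypothetical, and the input is the attB1 sequence of item (a).

```python
# Watson-Crick pairing table for DNA (A<->T, C<->G).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# attB1 as recited in item (a) of the claim (SEQ ID NO:6).
ATTB1 = "AGCCTGCTTTTTTGTACAAACTTGT"


def reverse_complement(dna):
    """Complement each base, then reverse, so the result reads 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def to_rna(dna):
    """Same strand as RNA: every T becomes U (assumed meaning of 'corresponding RNA')."""
    return dna.upper().replace("T", "U")
```

Under these assumptions, reverse_complement(ATTB1) gives ACAAGTTTGTACAAAAAAGCAGGCT, which coincides with the 25-mer given as SEQ ID NO:33 in the sequence listing, and to_rna(ATTB1) gives AGCCUGCUUUUUUGUACAAACUUGU.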

8. An Insert Donor DNA molecule, comprising a first 
DNA segment flanked by at least a first recombination site 
and a second recombination site, wherein said first and 
second recombination sites do not recombine with each 
other. 

9. The Insert Donor DNA molecule according to claim 8, 
wherein said desired DNA segment codes for at least one 
marker selected from the group consisting of a cloning site, 
a restriction site, a promoter, an operon, an origin of 
replication, a functional DNA, an antisense RNA, a PCR 
fragment, a protein and a protein fragment. 

10. The Insert Donor DNA molecule according to claim 8, 
wherein said recombination site comprises a DNA sequence 
selected from the group consisting of: 

(a) RKYCWGCTTTYKTRTACNAASTSGB (m-att) 
(SEQ ID NO:1); 

(b) AGCCWGCTTTYKTRTACNAACTSGB (m-attB) 
(SEQ ID NO:2); 

(c) GTTCAGCTTTCKTRTACNAACTSGB (m-attR) 
(SEQ ID NO:3); 

(d) AGCCWGCTTTCKTRTACNAAGTSGB (m-attL) 
(SEQ ID NO:4); 

(e) GTTCAGCTTTYKTRTACNAAGTSGB (m-attP1) 
(SEQ ID NO:5); 

and a corresponding or complementary DNA or RNA 
sequence, wherein R=A or G; K=G or T/U; Y=C or T/U; 
W=A or T/U; N=A, C, or G or T/U; S=C or G; and B=C, G 
or T/U. 

11. The Insert Donor DNA molecule according to claim 
10, wherein said DNA sequence comprises a sequence 
selected from the group consisting of: 

(a) AGCCTGCTTTTTTGTACAAACTTGT (attB1) 
(SEQ ID NO:6); 

(b) AGCCTGCTTTCTTGTACAAACTTGT (attB2) 
(SEQ ID NO:7); 

(c) ACCCAGCTTTCTTGTACAAACTTGT (attB3) 
(SEQ ID NO:8); 

(d) GTTCAGCTTTTTTGTACAAACTTGT (attR1) 
(SEQ ID NO:9); 

(e) GTTCAGCTTTCTTGTACAAACTTGT (attR2) 
(SEQ ID NO:10); 

(f) GTTCAGCTTTCTTGTACAAAGTTGG (attR3) 
(SEQ ID NO:11); 

(g) AGCCTGCTTTTTTGTACAAAGTTGG (attL1) 
(SEQ ID NO:12); 

(h) AGCCTGCTTTCTTGTACAAAGTTGG (attL2) 
(SEQ ID NO:13); 

(i) ACCCAGCTTTCTTGTACAAAGTTGG (attL3) 
(SEQ ID NO:14); 

(j) GTTCAGCTTTTTTGTACAAAGTTGG (attP1) 
(SEQ ID NO:15); 

(k) GTTCAGCTTTCTTGTACAAAGTTGG (attP2,P3) 
(SEQ ID NO:16); 

and a corresponding or complementary DNA or RNA 
sequence. 

12. A kit comprising at least one Vector Donor DNA 
molecule comprising at least a first DNA segment and a 
second DNA segment, said first or second DNA segment 
containing at least one Selectable marker, wherein said first 
or second DNA segment is flanked by at least a first and a 
second recombination site, that do not recombine with each 
other. 

13. The kit according to claim 12, further comprising at 
least one Insert Donor DNA molecule comprising a desired 
DNA segment flanked by at least a first recombination site 
and a second recombination site that do not recombine with 
each other. 

14. The kit according to claim 12, wherein said 
recombination site comprises a DNA sequence selected from the 
group consisting of: 

(a) RKYCWGCTTTYKTRTACNAASTSGB (m-att) 
(SEQ ID NO:1); 

(b) AGCCWGCTTTYKTRTACNAACTSGB (m-attB) 
(SEQ ID NO:2); 

(c) GTTCAGCTTTCKTRTACNAACTSGB (m-attR) 
(SEQ ID NO:3); 

(d) AGCCWGCTTTCKTRTACNAAGTSGB (m-attL) 
(SEQ ID NO:4); 

(e) GTTCAGCTTTYKTRTACNAAGTSGB (m-attP1) 
(SEQ ID NO:5); 

and a corresponding or complementary DNA or RNA 
sequence, wherein R=A or G; K=G or T/U; Y=C or T/U; 
W=A or T/U; N=A, C, or G or T/U; S=C or G; and B=C, G 
or T/U. 

15. The kit according to claim 14, wherein said DNA 
sequence comprises a sequence selected from the group 
consisting of: 

(a) AGCCTGCTTTTTTGTACAAACTTGT (attB1) 
(SEQ ID NO:6); 

(b) AGCCTGCTTTCTTGTACAAACTTGT (attB2) 
(SEQ ID NO:7); 

(c) ACCCAGCTTTCTTGTACAAACTTGT (attB3) 
(SEQ ID NO:8); 

(d) GTTCAGCTTTTTTGTACAAACTTGT (attR1) 
(SEQ ID NO:9); 

(e) GTTCAGCTTTCTTGTACAAACTTGT (attR2) 
(SEQ ID NO:10); 

(f) GTTCAGCTTTCTTGTACAAAGTTGG (attR3) 
(SEQ ID NO:11); 

(g) AGCCTGCTTTTTTGTACAAAGTTGG (attL1) 
(SEQ ID NO:12); 

(h) AGCCTGCTTTCTTGTACAAAGTTGG (attL2) 
(SEQ ID NO:13); 

(i) ACCCAGCTTTCTTGTACAAAGTTGG (attL3) 
(SEQ ID NO:14); 

(j) GTTCAGCTTTTTTGTACAAAGTTGG (attP1) 
(SEQ ID NO:15); 

(k) GTTCAGCTTTCTTGTACAAAGTTGG (attP2,P3) 
(SEQ ID NO:16); 

and a corresponding or complementary DNA or RNA 
sequence. 

16. A recombinant nucleic acid molecule, comprising at 
least one DNA segment comprising at least a first and a 
second recombination site flanking a Selectable marker or at 
least one desired DNA segment, wherein at least one of said 
first and second recombination sites comprises a core 
region that enhances recombination efficiency or specificity 
in vitro in the formation of a Cointegrate DNA or a Product 
DNA, and wherein said first and second sites do not 
recombine with each other. 

17. A composition, comprising the recombinant nucleic 
acid molecule according to claim 16, and a carrier. 

18. The recombinant nucleic acid molecule according to 
claim 16, wherein said recombination sites confer at least 
one enhancement selected from the group consisting of (i) 
enhancing excisive recombination; (ii) enhancing 
integrative recombination; (iii) decreasing the requirement for host 
factors; (iv) increasing the efficiency of the formation 
reaction of said Cointegrate DNA or of said 
Product DNA; (v) increasing the specificity of the formation 
reaction by recombination of said Cointegrate DNA or of 
said Product DNA; and (vi) increasing the specificity or 
yield of a subsequent recombination reaction of, or 
subsequent [illegible]lation of the Product DNA. 

19. The recombinant nucleic acid molecule according to 
[illegible] selected from the group 
consisting of att1, att2 and att3. 

20. The recombinant nucleic acid molecule according to 
claim 16, wherein said at least one of said recombination 
sites is from at least one att recombination site. 

21. The recombinant nucleic acid molecule according to 
claim 20, wherein the att site is at least one selected from the 
group consisting of attB, attP, attL and attR. 

22. The recombinant nucleic acid according to claim 16, 
wherein said core region comprises a DNA sequence 
selected from the group consisting of: 

(a) RKYCWGCTTTYKTRTACNAASTSGB (m-att) 
(SEQ ID NO:1); 

(b) AGCCWGCTTTYKTRTACNAACTSGB (m-attB) 
(SEQ ID NO:2); 

(c) GTTCAGCTTTCKTRTACNAACTSGB (m-attR) 
(SEQ ID NO:3); 

(d) AGCCWGCTTTCKTRTACNAAGTSGB (m-attL) 
(SEQ ID NO:4); 

(e) GTTCAGCTTTYKTRTACNAAGTSGB (m-attP1) 
(SEQ ID NO:5); 

and a corresponding or complementary DNA or RNA 
sequence, wherein R=A or G; K=G or T/U; Y=C or T/U; 
W=A or T/U; N=A or C or G or T/U; S=C or G; and B=C or 
G or T/U. 

23. The recombinant nucleic acid according to claim 22, 
wherein said core region comprises a DNA sequence 
selected from the group consisting of: 

(a) AGCCTGCTTTTTTGTACAAACTTGT (attB1) 
(SEQ ID NO:6); 

(b) AGCCTGCTTTCTTGTACAAACTTGT (attB2) 
(SEQ ID NO:7); 

(c) ACCCAGCTTTCTTGTACAAACTTGT (attB3) 
(SEQ ID NO:8); 

(d) GTTCAGCTTTTTTGTACAAACTTGT (attR1) 
(SEQ ID NO:9); 

(e) GTTCAGCTTTCTTGTACAAACTTGT (attR2) 
(SEQ ID NO:10); 

(f) GTTCAGCTTTCTTGTACAAAGTTGG (attR3) 
(SEQ ID NO:11); 

(g) AGCCTGCTTTTTTGTACAAAGTTGG (attL1) 
(SEQ ID NO:12); 

(h) AGCCTGCTTTCTTGTACAAAGTTGG (attL2) 
(SEQ ID NO:13); 

(i) ACCCAGCTTTCTTGTACAAAGTTGG (attL3) 
(SEQ ID NO:14); 

(j) GTTCAGCTTTTTTGTACAAAGTTGG (attP1) 
(SEQ ID NO:15); 

(k) GTTCAGCTTTCTTGTACAAAGTTGG (attP2,P3) 
(SEQ ID NO:16); 

and a corresponding or complementary DNA or RNA 
sequence. 

24. A kit, comprising the recombinant nucleic acid 
according to claim 16. 

25. The kit according to claim 24, further comprising at 
least one recombination protein that recognizes at least one 
of said recombination sites. 

26. A recombinant nucleic acid molecule, comprising 

at least one recombination site comprising at least one 
nucleic acid sequence having at least one of SEQ ID 
NOS:1-16, or a complementary DNA sequence or a 
corresponding RNA sequence. 

27. The method according to claim 26, wherein said 
nucleic acid sequence is selected from the group consisting 
of: 

(a) RKYCWGCTTTYKTRTACNAASTSGB (m-att) 
(SEQ ID NO:1); 

(b) AGCCWGCTTTYKTRTACNAACTSGB (m-attB) 
(SEQ ID NO:2); 

(c) GTTCAGCTTTCKTRTACNAACTSGB (m-attR) 
(SEQ ID NO:3); 

(d) AGCCWGCTTTCKTRTACNAAGTSGB (m-attL) 
(SEQ ID NO:4); 

(e) GTTCAGCTTTYKTRTACNAAGTSGB (m-attP1) 
(SEQ ID NO:5); 

and a corresponding or complementary DNA or RNA 
sequence, wherein R=A or G; K=G or T/U; Y=C or T/U; 
W=A or T/U; N=A, C, G or T/U; S=C or G; and B=C, G or 
T/U. 



28. The method according to claim 27, wherein said 
nucleic acid sequence is selected from the group consisting 
of: 

(a) AGCCTGCTTTTTTGTACAAACTTGT (attB1) 
(SEQ ID NO:6); 

(b) AGCCTGCTTTCTTGTACAAACTTGT (attB2) 
(SEQ ID NO:7); 

(c) ACCCAGCTTTCTTGTACAAACTTGT (attB3) 
(SEQ ID NO:8); 

(d) GTTCAGCTTTTTTGTACAAACTTGT (attR1) 
(SEQ ID NO:9); 

(e) GTTCAGCTTTCTTGTACAAACTTGT (attR2) 
(SEQ ID NO:10); 

(f) GTTCAGCTTTCTTGTACAAAGTTGG (attR3) 
(SEQ ID NO:11); 

(g) AGCCTGCTTTTTTGTACAAAGTTGG (attL1) 
(SEQ ID NO:12); 

(h) AGCCTGCTTTCTTGTACAAAGTTGG (attL2) 
(SEQ ID NO:13); 

(i) ACCCAGCTTTCTTGTACAAAGTTGG (attL3) 
(SEQ ID NO:14); 

(j) GTTCAGCTTTTTTGTACAAAGTTGG (attP1) 
(SEQ ID NO:15); 

(k) GTTCAGCTTTCTTGTACAAAGTTGG (attP2,P3) 
(SEQ ID NO:16); 

and a corresponding or complementary DNA or RNA 
sequence. 

29. A method of making a Cointegrate DNA molecule, 
comprising combining in vitro: 

(i) an Insert Donor DNA molecule, comprising a desired 
DNA segment flanked by a first recombination site and 
a second recombination site, wherein the first and 
second recombination sites do not recombine with each 
other; 

(ii) a Vector Donor DNA molecule containing a third 
recombination site and a fourth recombination site, 
wherein the third and fourth recombination sites do not 
recombine with each other; and 

(iii) at least one site specific recombination protein 
capable of recombining said first and third recombinational 
sites and/or said second and fourth recombinational 
sites; 

thereby allowing recombination to occur, so as to produce a 
Cointegrate DNA molecule comprising said first and third or 
said second and fourth recombination sites. 

30. The method as claimed in claim 29, wherein the 
Vector Donor DNA molecule comprises a vector segment 
flanked by said third and said fourth recombination sites. 

31. The method as claimed in claim 29, wherein the 
Vector Donor DNA molecule further comprises (a) a toxic 
gene and (b) a Selectable marker, wherein said toxic gene 
and said Selectable marker are on different DNA segments, 
the DNA segments being separated from each other by at 
least two recombination sites. 

32. The method as claimed in claim 29, wherein the 
Vector Donor DNA molecule further comprises (a) a 
repression cassette and (b) a Selectable marker that is repressed by 
the repressor encoded by said repression cassette, and 
wherein the Selectable marker and the repression cassette 
are on different DNA segments, the DNA segments being 
separated from each other by at least two recombination 
sites. 

33. The method as claimed in claim 29, wherein at least 
one of said Insert Donor DNA molecule and said Vector 
Donor DNA molecule is comprised of a circular DNA 
molecule. 

34. The method as claimed in claim 29, wherein at least 
one of said Insert Donor DNA molecule and said Vector 
Donor DNA molecule is comprised of a linear DNA molecule. 

35. The method of claim 29, further comprising the step 
of producing a Product DNA molecule from said Cointegrate 
DNA by recombining at least one of (i) said first and 
third, or (ii) said second and fourth, recombination 
sites, said Product DNA comprising said desired DNA 
segment. 

36. The method according to claim 35, wherein said 
method also produces a Byproduct DNA molecule, wherein 
said Byproduct DNA molecule does not contain said desired 
DNA segment and is produced with said Product DNA. 

37. The method according to claim 35, further comprising 
the step of selecting the Product DNA molecule. 

38. A recombinant nucleic acid molecule comprising at 
least a first and a second recombination site flanking at least 
one DNA segment containing at least one Selectable marker, 
wherein said first and second recombination sites do not 
recombine with each other. 

39. The recombinant nucleic acid molecule of claim 38, 
wherein said selectable marker is selected from the group 
consisting of: 

(i) a DNA segment that encodes a product that provides 
resistance against otherwise toxic compounds; 

(ii) a DNA segment that encodes a heterologous product; 

(iii) a DNA segment that encodes a product that sup- 
presses the activity of a gene product; 

(iv) a DNA segment that encodes a product that is 
identifiable; 

(v) a DNA segment that encodes a product that inhibits a 
cell function; 

(vi) a DNA segment that inhibits the activity of any of the 
DNA segments of (i) to (v) above; 

(vii) a DNA segment that binds a product that modifies a 
substrate; 

(viii) a DNA segment that provides for isolation of a 
desired molecule; 

(ix) a DNA segment that encodes a specific nucleotide 
recognition sequence which is recognized by an 
enzyme; and 

(x) a DNA segment that, when deleted, confers sensitivity 
to cell killing by a particular compound. 

40. The recombinant nucleic acid molecule of claim 38, 
wherein said selectable marker is selected from the group 
consisting of an antibiotic resistance gene, a tRNA gene, an 
auxotrophic marker, a toxic gene, a phenotypic marker, an 
antisense oligonucleotide, a restriction endonuclease, a 
restriction endonuclease cleavage site, an enzyme cleavage 
site, a protein binding site, and a sequence complementary to 
a PCR primer sequence. 

41. A kit comprising the recombinant nucleic acid mol- 
ecule of claim 38. 

42. The recombinant nucleic acid according to claim 38, 
wherein said DNA segment comprises a cloning site. 



43. The recombinant nucleic acid according to claim 42, 
wherein said nucleic acid contains at least one restriction 
enzyme site at said cloning site. 

44. The recombinant nucleic acid according to claim 38, 
wherein said DNA segment further comprises an insert DNA 
molecule. 

45. The recombinant nucleic acid according to claim 44, 
wherein said Insert DNA molecule codes for at least one 
marker selected from the group consisting of a restriction 
site, a promoter, an operon, an origin of replication, a 
functional DNA, an antisense RNA, a PCR fragment, a 
protein or a protein fragment. 

46. The molecule according to claim 38, wherein said 
recombination site comprises a DNA sequence selected from 
the group consisting of: 

(a) RKYCWGCTTTYKTRTACNAASTSGB (m-att) 
(SEQ ID NO:1); 

(b) AGCCWGCTTTYKTRTACNAACTSGB (m-attB) 
(SEQ ID NO:2); 

(c) GTTCAGCTTTCKTRTACNAACTSGB (m-attR) 
(SEQ ID NO:3); 

(d) AGCCWGCTTTCKTRTACNAAGTSGB (m-attL) 
(SEQ ID NO:4); 

(e) GTTCAGCTTTYKTRTACNAAGTSGB (m-attP1) 
(SEQ ID NO:5); 

and a corresponding or complementary DNA or RNA 
sequence, wherein R=A or G; K=G or T/U; Y=C or T/U; 
W=A or T/U; N=A, C, G or T/U; S=C or G; and B=C, G or 
T/U. 

47. The molecule according to claim 46, wherein said 
DNA sequence comprises a sequence selected from the 
group consisting of: 

(a) AGCCTGCTTTTTTGTACAAACTTGT (attB1) 
(SEQ ID NO:6); 

(b) AGCCTGCTTTCTTGTACAAACTTGT (attB2) 
(SEQ ID NO:7); 

(c) ACCCAGCTTTCTTGTACAAACTTGT (attB3) 
(SEQ ID NO:8); 

(d) GTTCAGCTTTTTTGTACAAACTTGT (attR1) 
(SEQ ID NO:9); 

(e) GTTCAGCTTTCTTGTACAAACTTGT (attR2) 
(SEQ ID NO:10); 

(f) GTTCAGCTTTCTTGTACAAAGTTGG (attR3) 
(SEQ ID NO:11); 

(g) AGCCTGCTTTTTTGTACAAAGTTGG (attL1) 
(SEQ ID NO:12); 

(h) AGCCTGCTTTCTTGTACAAAGTTGG (attL2) 
(SEQ ID NO:13); 

(i) ACCCAGCTTTCTTGTACAAAGTTGG (attL3) 
(SEQ ID NO:14); 

(j) GTTCAGCTTTTTTGTACAAAGTTGG (attP1) 
(SEQ ID NO:15); 

(k) GTTCAGCTTTCTTGTACAAAGTTGG (attP2,P3) 
(SEQ ID NO:16); 

and a corresponding or complementary DNA or RNA 
sequence. 